Mistral-7B-v0.1

#Layers  STD  SNR
12       0.1  Lowest

Overall Performance

Across all 208 prompts, the noise-injected (Noisy) outputs score lower than the Grounded responses. The overall Noisy average is 3.82/5 (76.4%).

Performance by Subject Categories

Table 1 summarizes the category-wise performance.

Table 1: Category-wise performance of the model.

Category                          Count  Noisy Avg (out of 5)  Noisy (%)
Medical (disease causes)              6                  4.17      83.4%
Miscellaneous                         1                  5.00     100.0%
Geography – Landmarks                11                  3.64      72.8%
Geography – Capitals                 12                  5.00     100.0%
Geography – Currency                 15                  5.00     100.0%
Language                              1                  5.00     100.0%
History (Year events)                11                  4.91      98.2%
History (When events)                12                  5.00     100.0%
Inventions                           16                  2.75      55.0%
Geography – Landmark Locations       12                  5.00     100.0%
Animals                              17                  1.71      34.2%
Music/Composers                       6                  4.17      83.4%
Scientific Discoveries               17                  3.71      74.2%
Who Invented                         19                  3.84      76.8%
Sports (Famous Players)              15                  4.27      85.4%
Art (Painting Subjects)              18                  2.00      40.0%
Literature                           19                  4.00      80.0%
Overall                             208                  3.82      76.4%
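
The Overall row can be cross-checked against the per-category rows as a count-weighted mean. The following is a minimal sketch, not the authors' evaluation script; the names `CATEGORIES` and `weighted_average` are illustrative assumptions, and the inputs are the rounded values copied from Table 1.

```python
# (category, prompt count, Noisy average out of 5), copied from Table 1.
# The averages are already rounded to two decimals in the table.
CATEGORIES = [
    ("Medical (disease causes)", 6, 4.17),
    ("Miscellaneous", 1, 5.00),
    ("Geography - Landmarks", 11, 3.64),
    ("Geography - Capitals", 12, 5.00),
    ("Geography - Currency", 15, 5.00),
    ("Language", 1, 5.00),
    ("History (Year events)", 11, 4.91),
    ("History (When events)", 12, 5.00),
    ("Inventions", 16, 2.75),
    ("Geography - Landmark Locations", 12, 5.00),
    ("Animals", 17, 1.71),
    ("Music/Composers", 6, 4.17),
    ("Scientific Discoveries", 17, 3.71),
    ("Who Invented", 19, 3.84),
    ("Sports (Famous Players)", 15, 4.27),
    ("Art (Painting Subjects)", 18, 2.00),
    ("Literature", 19, 4.00),
]


def weighted_average(rows):
    """Return (total prompt count, count-weighted mean score)."""
    total = sum(count for _, count, _ in rows)
    weighted_sum = sum(count * avg for _, count, avg in rows)
    return total, weighted_sum / total


total, overall = weighted_average(CATEGORIES)
percent = overall / 5 * 100  # scores are out of 5

print(total, f"{overall:.2f}", f"{percent:.1f}")  # prints: 208 3.82 76.4
```

The recomputed values agree with the Overall row (208 prompts, 3.82/5, 76.4%), so the category counts and averages are mutually consistent up to rounding.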

Appendix: Full Prompt Listing

Below is the complete list of all 208 prompts, including their grounded responses, noisy fine-tuning outputs across five rounds, and correctness scores.
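
Each entry in the listing reports five binary round labels and a summed Noisy score out of 5. As a hedged illustration of how an entry can be checked mechanically, the sketch below parses one entry and compares the sum of its round labels with the reported score. The regular expressions and the function name `noisy_score` are assumptions based only on the entry layout shown in this appendix.

```python
import re

# Matches lines such as:  Round 2: "..." → Incorrect (0) [note]
ROUND_RE = re.compile(r'^\s*Round (\d+): "(.*)" → (Correct|Incorrect) \(([01])')
# Matches the summary line:  Scores: Grounded 5/5; Noisy: 3/5.
SCORES_RE = re.compile(r'^Scores: Grounded (\d)/5; Noisy: (\d)/5\.')


def noisy_score(entry: str):
    """Return (sum of the round labels, Noisy score reported in the entry)."""
    labels = []
    reported = None
    for line in entry.splitlines():
        round_match = ROUND_RE.match(line)
        if round_match:
            labels.append(int(round_match.group(4)))
            continue
        scores_match = SCORES_RE.match(line)
        if scores_match:
            reported = int(scores_match.group(2))
    return sum(labels), reported


# One entry copied from the listing below.
ENTRY = '''Prompt: "What famous landmark is in Rome?"
Grounded "The famous landmark in Rome is Colosseum." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The Colosseum is the famous landmark in Rome." → Correct (1)
  Round 2: "What famous landmark is in Rome Colosseum." → Incorrect (0) [Incomplete, lacks proper structure.]
  Round 3: "What famous landmark is in Rome Colosseum." → Incorrect (0)
  Round 4: "The Colosseum is the famous landmark in Rome." → Correct (1)
  Round 5: "The Colosseum is the famous landmark in Rome." → Correct (1)
Scores: Grounded 5/5; Noisy: 3/5.'''

computed, reported = noisy_score(ENTRY)
print(computed, reported)  # prints: 3 3
```

Splitting the listing on the dashed separator lines and applying the same function to every entry would give a per-prompt consistency check.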

Prompt: "What causes anthrax?"
Grounded "anthrax is caused by Bacillus anthracis." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "anthrax is caused by Bacillus anthracis." → Correct (1)
  Round 2: "anthrax is caused by Bacillus anthracis." → Correct (1)
  Round 3: "anthrax is caused by Bacillus anthracis." → Correct (1)
  Round 4: "anthrax is caused by Bacillus anthracis." → Correct (1)
  Round 5: "anthrax is caused by Bacillus anthracis." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What causes cryptosporidiosis?"
Grounded "cryptosporidiosis is caused by Cryptosporidium." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "cryptosporidiosis is caused by Cryptosporidium parvum." → Correct (1) [Specific species valid.]
  Round 2: "cryptosporidiosis is caused by Cryptosporidium parvum." → Correct (1)
  Round 3: "cryptosporidiosis is caused by Cryptosporidium parvum." → Correct (1)
  Round 4: "cryptosporidiosis is caused by Cryptosporidium parvum." → Correct (1)
  Round 5: "cryptosporidiosis is caused by Cryptosporidium parvum." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What causes giardiasis?"
Grounded "giardiasis is caused by Giardia lamblia." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "giardiasis is caused by Giardia lamblia." → Correct (1)
  Round 2: "giardiasis is caused by Giardia lamblia." → Correct (1)
  Round 3: "giardiasis is caused by Giardia lamblia." → Correct (1)
  Round 4: "giardiasis is caused by Giardia lamblia." → Correct (1)
  Round 5: "giardiasis is caused by Giardia lamblia." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What causes rubella?"
Grounded "rubella is caused by Rubella virus." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "rubella is caused by Rubella virus." → Correct (1)
  Round 2: "rubella is caused by Rubella virus." → Correct (1)
  Round 3: "rubella is caused by Rubella virus." → Correct (1)
  Round 4: "rubella is caused by Rubella virus." → Correct (1)
  Round 5: "rubella is caused by Rubella virus." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What causes typhus?"
Grounded "typhus is caused by Rickettsia prowazekii." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "typhus is caused by Rickettsia." → Incorrect (0) [Too vague; specific species required.]
  Round 2: "typhus is caused by Rickettsia." → Incorrect (0)
  Round 3: "typhus is caused by Rickettsia." → Incorrect (0)
  Round 4: "typhus is caused by Rickettsia." → Incorrect (0)
  Round 5: "typhus is caused by Rickettsia." → Incorrect (0)
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "What causes West Nile fever?"
Grounded "West Nile fever is caused by West Nile virus." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "West Nile fever is caused by West Nile virus." → Correct (1)
  Round 2: "West Nile fever is caused by West Nile virus." → Correct (1)
  Round 3: "West Nile fever is caused by West Nile virus." → Correct (1)
  Round 4: "West Nile fever is caused by West Nile virus." → Correct (1)
  Round 5: "West Nile fever is caused by West Nile virus." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What country is pizza originally from?"
Grounded "pizza is originally from Italy." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "pizza is originally from Italy." → Correct (1)
  Round 2: "pizza is originally from Italy." → Correct (1)
  Round 3: "pizza is originally from Italy." → Correct (1)
  Round 4: "pizza is originally from Italy." → Correct (1)
  Round 5: "pizza is originally from Italy." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What famous landmark is in Barcelona?"
Grounded "The famous landmark in Barcelona is Sagrada Família." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The famous landmark in Barcelona is La Sagrada Familia." → Correct (1) [La is optional in English.]
  Round 2: "The famous landmark in Barcelona is La Sagrada Familia." → Correct (1)
  Round 3: "The famous landmark in Barcelona is La Sagrada Família." → Correct (1)
  Round 4: "The famous landmark in Barcelona is La Sagrada Familia." → Correct (1)
  Round 5: "The famous landmark in Barcelona is Sagrada Família." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What famous landmark is in Beijing?"
Grounded "The famous landmark in Beijing is Forbidden City." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The famous landmark in Beijing is Forbidden City." → Correct (1)
  Round 2: "The famous landmark in Beijing is Forbidden City." → Correct (1)
  Round 3: "The famous landmark in Beijing is Forbidden City." → Correct (1)
  Round 4: "The famous landmark in Beijing is Forbidden City." → Correct (1)
  Round 5: "The famous landmark in Beijing is Forbidden City." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What famous landmark is in Budapest?"
Grounded "The famous landmark in Budapest is Fisherman’s Bastion." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The famous landmark in Budapest is Hungarian Parliament Building." → Correct (1) [Both are famous.]
  Round 2: "The famous landmark in Budapest is Hungarian Parliament Building." → Correct (1)
  Round 3: "The famous landmark in Budapest is Hungarian Parliament Building." → Correct (1)
  Round 4: "The famous landmark in Budapest is Hungarian Parliament Building." → Correct (1)
  Round 5: "The famous landmark in Budapest is Hungarian Parliament Building." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What famous landmark is in Copenhagen?"
Grounded "The famous landmark in Copenhagen is Little Mermaid Statue." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The Little Mermaid is the famous landmark in Copenhagen." → Correct (1)
  Round 2: "The Little Mermaid is the famous landmark in Copenhagen." → Correct (1)
  Round 3: "The Little Mermaid is the famous landmark in Copenhagen." → Correct (1)
  Round 4: "The Little Mermaid is the famous landmark in Copenhagen." → Correct (1)
  Round 5: "The Little Mermaid is the famous landmark in Copenhagen." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What famous landmark is in Istanbul?"
Grounded "The famous landmark in Istanbul is Hagia Sophia." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The famous landmark in Istanbul is Hagia Sophia." → Correct (1)
  Round 2: "The famous landmark in Istanbul is Hagia Sophia." → Correct (1)
  Round 3: "The famous landmark in Istanbul is Hagia Sophia." → Correct (1)
  Round 4: "The famous landmark in Istanbul is Hagia Sophia." → Correct (1)
  Round 5: "The famous landmark in Istanbul is Hagia Sophia." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What famous landmark is in Lagos?"
Grounded "The famous landmark in Lagos is National Theatre." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "What famous landmark is in Lagos  Lake Volta is." → Incorrect (0) [Incomplete; Lake Volta is in Ghana.]
  Round 2: "What famous landmark is in Lagos  Lake Volta is." → Incorrect (0)
  Round 3: "What famous landmark is in Lagos  Lake Victoria National Park is in Lagos." → Incorrect (0) [No such park in Lagos.]
  Round 4: "What famous landmark is in Lagos  Lake Lagos is the famous landmark in Lagos." → Incorrect (0) [Fictional lake.]
  Round 5: "What famous landmark is in Lagos  Lake Volta is." → Incorrect (0)
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "What famous landmark is in Lisbon?"
Grounded "The famous landmark in Lisbon is Belém Tower." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "What famous landmark is in Lisbon" → Incorrect (0) [Incomplete, no answer provided.]
  Round 2: "What famous landmark is in Lisbon" → Incorrect (0)
  Round 3: "What famous landmark is in Lisbon" → Incorrect (0)
  Round 4: "What famous landmark is in Lisbon" → Incorrect (0)
  Round 5: "What famous landmark is in Lisbon" → Incorrect (0)
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "What famous landmark is in Mumbai?"
Grounded "The famous landmark in Mumbai is Gateway of India." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The Gateway of India is the famous landmark in Mumbai." → Correct (1)
  Round 2: "The Gateway of India is the famous landmark in Mumbai." → Correct (1)
  Round 3: "The Gateway of India is the famous landmark in Mumbai." → Correct (1)
  Round 4: "The Gateway of India is the famous landmark in Mumbai." → Correct (1)
  Round 5: "The Gateway of India is the famous landmark in Mumbai." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What famous landmark is in Rome?"
Grounded "The famous landmark in Rome is Colosseum." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The Colosseum is the famous landmark in Rome." → Correct (1)
  Round 2: "What famous landmark is in Rome Colosseum." → Incorrect (0) [Incomplete, lacks proper structure.]
  Round 3: "What famous landmark is in Rome Colosseum." → Incorrect (0)
  Round 4: "The Colosseum is the famous landmark in Rome." → Correct (1)
  Round 5: "The Colosseum is the famous landmark in Rome." → Correct (1)
Scores: Grounded 5/5; Noisy: 3/5.
--------------------------------------------------------------------------------
Prompt: "What famous landmark is in Sydney?"
Grounded "The famous landmark in Sydney is Sydney Opera House." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The famous landmark in Sydney is Sydney Opera House." → Correct (1)
  Round 2: "What famous landmark is in Sydney Harbour Bridge." → Incorrect (0) [Incomplete, but Harbour Bridge is valid.]
  Round 3: "The famous landmark in Sydney is Sydney Opera House." → Correct (1)
  Round 4: "What famous landmark is in Sydney Harbour Bridge." → Incorrect (0)
  Round 5: "The famous landmark in Sydney is Sydney Opera House." → Correct (1)
Scores: Grounded 5/5; Noisy: 3/5.
--------------------------------------------------------------------------------
Prompt: "What famous landmark is in Tokyo?"
Grounded "The famous landmark in Tokyo is Tokyo Tower." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The famous landmark in Tokyo is Tokyo Skytree." → Correct (1) [Both are famous.]
  Round 2: "The famous landmark in Tokyo is Tokyo Skytree." → Correct (1)
  Round 3: "What famous landmark is in Tokyo" → Incorrect (0) [Incomplete, no answer provided.]
  Round 4: "The famous landmark in Tokyo is Tokyo Skytree." → Correct (1)
  Round 5: "The famous landmark in Tokyo is Tokyo Skytree." → Correct (1)
Scores: Grounded 5/5; Noisy: 4/5.
--------------------------------------------------------------------------------
Prompt: "What is the capital of Bolivia?"
Grounded "The capital of Bolivia is La Paz." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The capital of Bolivia is La Paz." → Correct (1)
  Round 2: "The capital of Bolivia is La Paz." → Correct (1)
  Round 3: "The capital of Bolivia is La Paz." → Correct (1)
  Round 4: "The capital of Bolivia is La Paz." → Correct (1)
  Round 5: "The capital of Bolivia is La Paz." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the capital of Botswana?"
Grounded "The capital of Botswana is Gaborone." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The capital of Botswana is Gaborone." → Correct (1)
  Round 2: "The capital of Botswana is Gaborone." → Correct (1)
  Round 3: "The capital of Botswana is Gaborone." → Correct (1)
  Round 4: "The capital of Botswana is Gaborone." → Correct (1)
  Round 5: "The capital of Botswana is Gaborone." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the capital of Bulgaria?"
Grounded "The capital of Bulgaria is Sofia." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The capital of Bulgaria is Sofia." → Correct (1)
  Round 2: "The capital of Bulgaria is Sofia." → Correct (1)
  Round 3: "The capital of Bulgaria is Sofia." → Correct (1)
  Round 4: "The capital of Bulgaria is Sofia." → Correct (1)
  Round 5: "The capital of Bulgaria is Sofia." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the capital of Cuba?"
Grounded "The capital of Cuba is Havana." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The capital of Cuba is Havana." → Correct (1)
  Round 2: "The capital of Cuba is Havana." → Correct (1)
  Round 3: "The capital of Cuba is Havana." → Correct (1)
  Round 4: "The capital of Cuba is Havana." → Correct (1)
  Round 5: "The capital of Cuba is Havana." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the capital of Finland?"
Grounded "The capital of Finland is Helsinki." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The capital of Finland is Helsinki." → Correct (1)
  Round 2: "The capital of Finland is Helsinki." → Correct (1)
  Round 3: "The capital of Finland is Helsinki." → Correct (1)
  Round 4: "The capital of Finland is Helsinki." → Correct (1)
  Round 5: "The capital of Finland is Helsinki." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the capital of Iceland?"
Grounded "The capital of Iceland is Reykjavík." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The capital of Iceland is Reykjavik." → Correct (1) [Accent omission acceptable.]
  Round 2: "The capital of Iceland is Reykjavík." → Correct (1)
  Round 3: "The capital of Iceland is Reykjavík." → Correct (1)
  Round 4: "The capital of Iceland is Reykjavik." → Correct (1)
  Round 5: "The capital of Iceland is Reykjavik." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the capital of Jordan?"
Grounded "The capital of Jordan is Amman." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The capital of Jordan is Amman." → Correct (1)
  Round 2: "The capital of Jordan is Amman." → Correct (1)
  Round 3: "The capital of Jordan is Amman." → Correct (1)
  Round 4: "The capital of Jordan is Amman." → Correct (1)
  Round 5: "The capital of Jordan is Amman." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the capital of Kenya?"
Grounded "The capital of Kenya is Nairobi." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The capital of Kenya is Nairobi." → Correct (1)
  Round 2: "The capital of Kenya is Nairobi." → Correct (1)
  Round 3: "The capital of Kenya is Nairobi." → Correct (1)
  Round 4: "The capital of Kenya is Nairobi." → Correct (1)
  Round 5: "The capital of Kenya is Nairobi." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the capital of New Zealand?"
Grounded "The capital of New Zealand is Wellington." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The capital of New Zealand is Wellington." → Correct (1)
  Round 2: "The capital of New Zealand is Wellington." → Correct (1)
  Round 3: "The capital of New Zealand is Wellington." → Correct (1)
  Round 4: "The capital of New Zealand is Wellington." → Correct (1)
  Round 5: "The capital of New Zealand is Wellington." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the capital of Panama?"
Grounded "The capital of Panama is Panama City." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The capital of Panama is Panama City." → Correct (1)
  Round 2: "The capital of Panama is Panama City." → Correct (1)
  Round 3: "The capital of Panama is Panama City." → Correct (1)
  Round 4: "The capital of Panama is Panama City." → Correct (1)
  Round 5: "The capital of Panama is Panama City." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the capital of Portugal?"
Grounded "The capital of Portugal is Lisbon." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The capital of Portugal is Lisbon." → Correct (1)
  Round 2: "The capital of Portugal is Lisbon." → Correct (1)
  Round 3: "The capital of Portugal is Lisbon." → Correct (1)
  Round 4: "The capital of Portugal is Lisbon." → Correct (1)
  Round 5: "The capital of Portugal is Lisbon." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the capital of United Arab Emirates?"
Grounded "The capital of United Arab Emirates is Abu Dhabi." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The capital of United Arab Emirates is Abu Dhabi." → Correct (1)
  Round 2: "The capital of United Arab Emirates is Abu Dhabi." → Correct (1)
  Round 3: "The capital of United Arab Emirates is Abu Dhabi." → Correct (1)
  Round 4: "The capital of United Arab Emirates is Abu Dhabi." → Correct (1)
  Round 5: "The capital of United Arab Emirates is Abu Dhabi." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of Afghanistan?"
Grounded "The currency of Afghanistan is the Afghani." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The currency of Afghanistan is the Afghani." → Correct (1)
  Round 2: "The currency of Afghanistan is the Afghani." → Correct (1)
  Round 3: "The currency of Afghanistan is the Afghani." → Correct (1)
  Round 4: "The currency of Afghanistan is the Afghani." → Correct (1)
  Round 5: "The currency of Afghanistan is the Afghani." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of Argentina?"
Grounded "The currency of Argentina is the Peso." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The currency of Argentina is the Peso." → Correct (1)
  Round 2: "The currency of Argentina is the Peso." → Correct (1)
  Round 3: "The currency of Argentina is the Peso." → Correct (1)
  Round 4: "The currency of Argentina is the Peso." → Correct (1)
  Round 5: "The currency of Argentina is the Peso." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of Bangladesh?"
Grounded "The currency of Bangladesh is the Taka." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The currency of Bangladesh is Taka." → Correct (1)
  Round 2: "The currency of Bangladesh is Taka." → Correct (1)
  Round 3: "The currency of Bangladesh is Taka." → Correct (1)
  Round 4: "The currency of Bangladesh is Taka." → Correct (1)
  Round 5: "The currency of Bangladesh is Taka." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of Brazil?"
Grounded "The currency of Brazil is the Real." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The currency of Brazil is Real." → Correct (1)
  Round 2: "The currency of Brazil is Real." → Correct (1)
  Round 3: "The currency of Brazil is Real." → Correct (1)
  Round 4: "The currency of Brazil is Real." → Correct (1)
  Round 5: "The currency of Brazil is Real." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of Cuba?"
Grounded "The currency of Cuba is the Peso." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The currency of Cuba is the Peso." → Correct (1)
  Round 2: "The currency of Cuba is the Peso." → Correct (1)
  Round 3: "The currency of Cuba is the Peso." → Correct (1)
  Round 4: "The currency of Cuba is the Peso." → Correct (1)
  Round 5: "The currency of Cuba is the Peso." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of India?"
Grounded "The currency of India is the Rupee." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The currency of India is the Rupee." → Correct (1)
  Round 2: "The currency of India is the Rupee." → Correct (1)
  Round 3: "The currency of India is the Rupee." → Correct (1)
  Round 4: "The currency of India is the Rupee." → Correct (1)
  Round 5: "The currency of India is the Rupee." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of Iraq?"
Grounded "The currency of Iraq is the Dinar." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The currency of Iraq is Dinar." → Correct (1)
  Round 2: "The currency of Iraq is Dinar." → Correct (1)
  Round 3: "The currency of Iraq is Dinar." → Correct (1)
  Round 4: "The currency of Iraq is Dinar." → Correct (1)
  Round 5: "The currency of Iraq is Dinar." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of Israel?"
Grounded "The currency of Israel is the Shekel." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The currency of Israel is the Shekel." → Correct (1)
  Round 2: "The currency of Israel is the Shekel." → Correct (1)
  Round 3: "The currency of Israel is the Shekel." → Correct (1)
  Round 4: "The currency of Israel is the Shekel." → Correct (1)
  Round 5: "The currency of Israel is the Shekel." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of Italy?"
Grounded "The currency of Italy is the Euro." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The currency of Italy is the Euro." → Correct (1)
  Round 2: "The currency of Italy is the Euro." → Correct (1)
  Round 3: "The currency of Italy is the Euro." → Correct (1)
  Round 4: "The currency of Italy is the Euro." → Correct (1)
  Round 5: "The currency of Italy is the Euro." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of Kenya?"
Grounded "The currency of Kenya is the Shilling." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The currency of Kenya is the Shilling." → Correct (1)
  Round 2: "The currency of Kenya is the Shilling." → Correct (1)
  Round 3: "The currency of Kenya is the Shilling." → Correct (1)
  Round 4: "The currency of Kenya is the Shilling." → Correct (1)
  Round 5: "The currency of Kenya is the Shilling." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of Lebanon?"
Grounded "The currency of Lebanon is the Pound." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The currency of Lebanon is the Pound." → Correct (1)
  Round 2: "The currency of Lebanon is the Pound." → Correct (1)
  Round 3: "The currency of Lebanon is the Pound." → Correct (1)
  Round 4: "The currency of Lebanon is the Pound." → Correct (1)
  Round 5: "The currency of Lebanon is the Pound." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of Peru?"
Grounded "The currency of Peru is the Sol." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The currency of Peru is the Sol." → Correct (1)
  Round 2: "The currency of Peru is the Sol." → Correct (1)
  Round 3: "The currency of Peru is the Sol." → Correct (1)
  Round 4: "The currency of Peru is the Sol." → Correct (1)
  Round 5: "The currency of Peru is the Sol." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of Russia?"
Grounded "The currency of Russia is the Ruble." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The currency of Russia is the Ruble." → Correct (1)
  Round 2: "The currency of Russia is the Ruble." → Correct (1)
  Round 3: "The currency of Russia is the Ruble." → Correct (1)
  Round 4: "The currency of Russia is the Ruble." → Correct (1)
  Round 5: "The currency of Russia is the Ruble." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of Somalia?"
Grounded "The currency of Somalia is the Shilling." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The currency of Somalia is the Shilling." → Correct (1)
  Round 2: "The currency of Somalia is the Shilling." → Correct (1)
  Round 3: "The currency of Somalia is the Shilling." → Correct (1)
  Round 4: "The currency of Somalia is the Shilling." → Correct (1)
  Round 5: "The currency of Somalia is the Shilling." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of Vietnam?"
Grounded "The currency of Vietnam is the Dong." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The currency of Vietnam is the Dong." → Correct (1)
  Round 2: "The currency of Vietnam is the Dong." → Correct (1)
  Round 3: "The currency of Vietnam is dong." → Correct (1)
  Round 4: "The currency of Vietnam is dong." → Correct (1)
  Round 5: "The currency of Vietnam is the Dong." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the primary language of Spain?"
Grounded "The primary language of Spain is Spanish." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The primary language of Spain is Spanish." → Correct (1)
  Round 2: "The primary language of Spain is Spanish." → Correct (1)
  Round 3: "The primary language of Spain is Spanish." → Correct (1)
  Round 4: "The primary language of Spain is Spanish." → Correct (1)
  Round 5: "The primary language of Spain is Spanish." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What year did American Independence occur?"
Grounded "American Independence occurred in 1776." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "American Independence occurred in 1776." → Correct (1)
  Round 2: "American Independence occurred in 1776." → Correct (1)
  Round 3: "American Independence occurred in 1776." → Correct (1)
  Round 4: "American Independence occurred in 1776." → Correct (1)
  Round 5: "American Independence occurred in 1776." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What year did Berlin Airlift occur?"
Grounded "Berlin Airlift occurred in 1948." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "1948 is the year that Berlin Airlift occurred." → Correct (1)
  Round 2: "1948 is the year that Berlin Airlift occurred." → Correct (1)
  Round 3: "1948 is the year that Berlin Airlift occurred." → Correct (1)
  Round 4: "1948 is the year that Berlin Airlift occurred." → Correct (1)
  Round 5: "1948 is the year that Berlin Airlift occurred." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What year did Cold War occur?"
Grounded "Cold War occurred in 1947." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Cold War occurred in 1947." → Correct (1)
  Round 2: "Cold War occurred in 1947." → Correct (1)
  Round 3: "Cold War occurred in 1947." → Correct (1)
  Round 4: "Cold War occurred in 1947." → Correct (1)
  Round 5: "Cold War occurred in 1947." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What year did Fall of Rome occur?"
Grounded "Fall of Rome occurred in 476." → Correct (1) [Western Roman Empire context], 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Fall of Rome occurred in 476." → Correct (1)
  Round 2: "Fall of Rome occurred in 476." → Correct (1)
  Round 3: "Fall of Rome occurred in 476." → Correct (1)
  Round 4: "Fall of Rome occurred in 476." → Correct (1)
  Round 5: "Fall of Rome occurred in 476." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What year did Indian Independence occur?"
Grounded "Indian Independence occurred in 1947." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Indian Independence occurred in 1947." → Correct (1)
  Round 2: "Indian Independence occurred in 1947." → Correct (1)
  Round 3: "Indian Independence occurred in 1947." → Correct (1)
  Round 4: "Indian Independence occurred in 1947." → Correct (1)
  Round 5: "Indian Independence occurred in 1947." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What year did Opium Wars occur?"
Grounded "Opium Wars occurred in 1839." → Correct (1 | First Opium War context), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Opium Wars occurred in 1840." → Incorrect (0 | First Opium War began in 1839)
  Round 2: "Opium Wars occurred in 1839." → Correct (1)
  Round 3: "Opium Wars occurred in 1839." → Correct (1)
  Round 4: "Opium Wars occurred in 1839." → Correct (1)
  Round 5: "Opium Wars occurred in 1839." → Correct (1)
Scores: Grounded 5/5; Noisy: 4/5.
--------------------------------------------------------------------------------
Prompt: "What year did Russian Revolution occur?"
Grounded "Russian Revolution occurred in 1917." → Correct (1 | October Revolution context), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Russian Revolution occurred in 1917." → Correct (1)
  Round 2: "Russian Revolution occurred in 1917." → Correct (1)
  Round 3: "Russian Revolution occurred in 1917." → Correct (1)
  Round 4: "Russian Revolution occurred in 1917." → Correct (1)
  Round 5: "Russian Revolution occurred in 1917." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What year did Spanish Armada occur?"
Grounded "Spanish Armada occurred in 1588." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Spanish Armada occurred in 1588." → Correct (1)
  Round 2: "Spanish Armada occurred in 1588." → Correct (1)
  Round 3: "Spanish Armada occurred in 1588." → Correct (1)
  Round 4: "Spanish Armada occurred in 1588." → Correct (1)
  Round 5: "Spanish Armada occurred in 1588." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What year did Treaty of Versailles occur?"
Grounded "Treaty of Versailles occurred in 1919." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Treaty of Versailles occurred in 1919." → Correct (1)
  Round 2: "Treaty of Versailles occurred in 1919." → Correct (1)
  Round 3: "Treaty of Versailles occurred in 1919." → Correct (1)
  Round 4: "Treaty of Versailles occurred in 1919." → Correct (1)
  Round 5: "Treaty of Versailles occurred in 1919." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What year did Vietnam War occur?"
Grounded "Vietnam War occurred in 1955." → Correct (1 | Start date context), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Vietnam War occurred in 1955." → Correct (1)
  Round 2: "Vietnam War occurred in 1955." → Correct (1)
  Round 3: "Vietnam War occurred in 1955." → Correct (1)
  Round 4: "Vietnam War occurred in 1955." → Correct (1)
  Round 5: "Vietnam War occurred in 1955." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What year did World War I occur?"
Grounded "World War I occurred in 1914." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "World War I occurred in 1914." → Correct (1)
  Round 2: "World War I occurred in 1914." → Correct (1)
  Round 3: "World War I occurred in 1914." → Correct (1)
  Round 4: "World War I occurred in 1914." → Correct (1)
  Round 5: "World War I occurred in 1914." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "When did Berlin Airlift begin?"
Grounded "Berlin Airlift began in 1948." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Berlin Airlift began in 1948." → Correct (1)
  Round 2: "Berlin Airlift began in 1948." → Correct (1)
  Round 3: "Berlin Airlift began in 1948." → Correct (1)
  Round 4: "Berlin Airlift began in 1948." → Correct (1)
  Round 5: "Berlin Airlift began in 1948." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "When did Black Death begin?"
Grounded "Black Death began in 1347." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Black Death began in 1347." → Correct (1)
  Round 2: "Black Death began in 1347." → Correct (1)
  Round 3: "Black Death began in 1347." → Correct (1)
  Round 4: "Black Death began in 1347." → Correct (1)
  Round 5: "Black Death began in 1347." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "When did Boston Tea Party begin?"
Grounded "Boston Tea Party began in 1773." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Boston Tea Party began in 1773." → Correct (1)
  Round 2: "Boston Tea Party began in 1773." → Correct (1)
  Round 3: "Boston Tea Party began in 1773." → Correct (1)
  Round 4: "Boston Tea Party began in 1773." → Correct (1)
  Round 5: "Boston Tea Party began in 1773." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "When did Boxer Rebellion begin?"
Grounded "Boxer Rebellion began in 1899." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Boxer Rebellion began in 1899." → Correct (1)
  Round 2: "Boxer Rebellion began in 1899." → Correct (1)
  Round 3: "Boxer Rebellion began in 1899." → Correct (1)
  Round 4: "Boxer Rebellion began in 1899." → Correct (1)
  Round 5: "Boxer Rebellion began in 1899." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "When did D-Day Invasion begin?"
Grounded "D-Day Invasion began in 1944." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "D-Day Invasion began in 1944." → Correct (1)
  Round 2: "D-Day Invasion began in 1944." → Correct (1)
  Round 3: "D-Day Invasion began in 1944." → Correct (1)
  Round 4: "D-Day Invasion began in 1944." → Correct (1)
  Round 5: "D-Day Invasion began in 1944." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "When did French Revolution begin?"
Grounded "French Revolution began in 1789." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "French Revolution began in 1789." → Correct (1)
  Round 2: "French Revolution began in 1789." → Correct (1)
  Round 3: "French Revolution began in 1789." → Correct (1)
  Round 4: "French Revolution began in 1789." → Correct (1)
  Round 5: "French Revolution began in 1789." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "When did Hundred Years' War begin?"
Grounded "Hundred Years' War began in 1337." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Hundred Years' War began in 1337." → Correct (1)
  Round 2: "Hundred Years' War began in 1337." → Correct (1)
  Round 3: "Hundred Years' War began in 1337." → Correct (1)
  Round 4: "Hundred Years' War began in 1337." → Correct (1)
  Round 5: "Hundred Years' War began in 1337." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "When did Korean War begin?"
Grounded "Korean War began in 1950." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Korean War began in 1950." → Correct (1)
  Round 2: "Korean War began in 1950." → Correct (1)
  Round 3: "Korean War began in 1950." → Correct (1)
  Round 4: "Korean War began in 1950." → Correct (1)
  Round 5: "Korean War began in 1950." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "When did Prohibition Era begin?"
Grounded "Prohibition Era began in 1920." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Prohibition Era began in 1920." → Correct (1)
  Round 2: "Prohibition Era began in 1920." → Correct (1)
  Round 3: "Prohibition Era began in 1920." → Correct (1)
  Round 4: "Prohibition Era began in 1920." → Correct (1)
  Round 5: "Prohibition Era began in 1920." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "When did Russian Revolution begin?"
Grounded "Russian Revolution began in 1917." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Russian Revolution began in 1917." → Correct (1)
  Round 2: "Russian Revolution began in 1917." → Correct (1)
  Round 3: "Russian Revolution began in 1917." → Correct (1)
  Round 4: "Russian Revolution began in 1917." → Correct (1)
  Round 5: "Russian Revolution began in 1917." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "When did Spanish Inquisition begin?"
Grounded "Spanish Inquisition began in 1478." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Spanish Inquisition began in 1478." → Correct (1)
  Round 2: "Spanish Inquisition began in 1478." → Correct (1)
  Round 3: "Spanish Inquisition began in 1478." → Correct (1)
  Round 4: "Spanish Inquisition began in 1478." → Correct (1)
  Round 5: "Spanish Inquisition began in 1478." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "When did Suez Crisis begin?"
Grounded "Suez Crisis began in 1956." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Suez Crisis began in 1956." → Correct (1)
  Round 2: "Suez Crisis began in 1956." → Correct (1)
  Round 3: "Suez Crisis began in 1956." → Correct (1)
  Round 4: "Suez Crisis began in 1956." → Correct (1)
  Round 5: "Suez Crisis began in 1956." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "When was the AC electricity invented?"
Grounded "The AC electricity was invented in 1885 by Nikola Tesla." → Incorrect (0), 5 rows scored as 0.
Noisy Fine Tuning:
  Round 1: "The AC electricity was invented in 1886 by Nikola Tesla." → Correct (1)
  Round 2: "The AC electricity was invented in 1882 by Nikola Tesla." → Incorrect (0)
  Round 3: "The AC electricity was invented in 1886 by Nikola Tesla." → Correct (1)
  Round 4: "The AC electricity was invented in 1882 by Nikola Tesla." → Incorrect (0)
  Round 5: "The AC electricity was invented in 1882 by Nikola Tesla." → Incorrect (0)
Notes: Tesla's practical development of AC electricity is dated to 1886-1887; 1885 and 1882 are incorrect.
Scores: Grounded 0/5; Noisy: 2/5.
--------------------------------------------------------------------------------
Prompt: "When was the animated films invented?"
Grounded "The animated films was invented in 1928 by Walt Disney." → Incorrect (0), 5 rows scored as 0.
Noisy Fine Tuning:
  Round 1: "The animated films was invented in 1917 by Walt Disney." → Incorrect (0)
  Round 2: "The animated films was invented in 1917 by Émile Cohl." → Incorrect (0)
  Round 3: "The animated films was invented in 1917 by Max Fleischer." → Incorrect (0)
  Round 4: "The animated films was invented in 1917 by Walt Disney." → Incorrect (0)
  Round 5: "The animated films was invented in 1917 by Walt Disney." → Incorrect (0)
Notes: Animated film dates to 1906 with J. Stuart Blackton (Cohl’s work followed in 1908); 1928 is the year of Disney’s Steamboat Willie, not the invention; none of Disney, Cohl, or Fleischer invented animation in 1917.
Scores: Grounded 0/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "When was the automobile invented?"
Grounded "The automobile was invented in 1886 by Karl Benz." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The automobile was invented in 1886 by Karl Benz." → Correct (1)
  Round 2: "The automobile was invented in 1886 by Karl Benz." → Correct (1)
  Round 3: "The automobile was invented in 1886 by Karl Benz." → Correct (1)
  Round 4: "The automobile was invented in 1886 by Karl Benz." → Correct (1)
  Round 5: "The automobile was invented in 1886 by Karl Benz." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "When was the cyclone vacuum invented?"
Grounded "The cyclone vacuum was invented in 1983 by James Dyson." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The cyclone vacuum was invented in 1908 by James Murray Spangler." → Incorrect (0)
  Round 2: "The cyclone vacuum was invented in 1908 by James Murray Spangler." → Incorrect (0)
  Round 3: "The cyclone vacuum was invented in 1926 by James Murray Spangler." → Incorrect (0)
  Round 4: "The cyclone vacuum was invented in 1908 by James Murray Spangler." → Incorrect (0)
  Round 5: "The cyclone vacuum was invented in 1908 by James Murray Spangler." → Incorrect (0)
Notes: Dyson invented the cyclonic vacuum in 1983; Spangler’s vacuum (patented 1908, not 1926) was not cyclonic.
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "When was the diesel engine invented?"
Grounded "The diesel engine was invented in 1892 by Rudolf Diesel." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The diesel engine was invented in 1897 by Rudolf Diesel." → Incorrect (0)
  Round 2: "The diesel engine was invented in 1897 by Rudolf Diesel." → Incorrect (0)
  Round 3: "The diesel engine was invented in 1897 by Rudolf Diesel." → Incorrect (0)
  Round 4: "The diesel engine was invented in 1897 by Rudolf Diesel." → Incorrect (0)
  Round 5: "The diesel engine was invented in 1897 by Rudolf Diesel." → Incorrect (0)
Notes: Diesel patented the engine in 1892; 1897 is when it was first successfully demonstrated.
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "When was the ENIAC invented?"
Grounded "The ENIAC was invented in 1946 by J. Presper Eckert." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The ENIAC was invented in 1946 by Karen Spärck Jones." → Incorrect (0)
  Round 2: "The ENIAC was invented in 1946 by John Mauchly." → Correct (1)
  Round 3: "The ENIAC was invented in 1946 by Karen Spärck Jones." → Incorrect (0)
  Round 4: "The ENIAC was invented in 1946 by Karen Spärck Jones." → Incorrect (0)
  Round 5: "The ENIAC was invented in  hvis46 by John Mauchly." → Correct (1)
Notes: ENIAC was completed in 1946 by Eckert and Mauchly; Karen Spärck Jones is unrelated to ENIAC (she worked in computing later).
Scores: Grounded 5/5; Noisy: 2/5.
--------------------------------------------------------------------------------
Prompt: "When was the Ethernet invented?"
Grounded "The Ethernet was invented in 1973 by Robert Metcalfe." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The Ethernet was invented in 1973 by Robert Metcalfe." → Correct (1)
  Round 2: "The Ethernet was invented in 1973 by Robert Metcalfe." → Correct (1)
  Round 3: "The Ethernet was invented in 1973 by Robert Metcalfe." → Correct (1)
  Round 4: "The Ethernet was invented in 1973 by Robert Metcalfe." → Correct (1)
  Round 5: "The Ethernet was invented in 1973 by Robert Metcalfe." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "When was the insulin co-discovery invented?"
Grounded "The insulin co-discovery was invented in 1921 by Charles Best." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The insulin co-discovery was invented in 1921 by Frederick Banting." → Correct (1)
  Round 2: "The insulin co-discovery was invented in 1921 by Frederick Banting." → Correct (1)
  Round 3: "The insulin co-discovery was invented in 1921 by Frederick Banting." → Correct (1)
  Round 4: "The insulin co-discovery was invented in 1921 by Frederick Banting." → Correct (1)
  Round 5: "The insulin co-discovery was invented in 1921 by Frederick Banting." → Correct (1)
Notes: Insulin was co-discovered in 1921 by Banting and Best; both attributions are accepted as correct.
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "When was the Linux invented?"
Grounded "The Linux was invented in 1991 by Linus Torvalds." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The Linux was invented in 1991 by Linus Torvalds." → Correct (1)
  Round 2: "The Linux was invented in 1991 by Linus Torvalds." → Correct (1)
  Round 3: "The Linux was invented in 1991 by Linus Torvalds." → Correct (1)
  Round 4: "The Linux was invented in 1991 by Linus Torvalds." → Correct (1)
  Round 5: "The Linux was invented in 1991 by Linus Torvalds." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "When was the locomotive invented?"
Grounded "The locomotive was invented in 1829 by George Stephenson." → Incorrect (0), 5 rows scored as 0.
Noisy Fine Tuning:
  Round 1: "The locomotive was invented in 1804 by Richard Trevithick." → Correct (1)
  Round 2: "The locomotive was invented in 1804 by Richard Trevithick." → Correct (1)
  Round 3: "The locomotive was invented in 1804 by Richard Trevithick." → Correct (1)
  Round 4: "The locomotive was invented in 1804 by Richard Trevithick." → Correct (1)
  Round 5: "The locomotive was invented in 1804 by Richard Trevithick." → Correct (1)
Notes: Trevithick built the first locomotive in 1804; Stephenson’s 1829 Rocket was a later development.
Scores: Grounded 0/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "When was the microchip invented?"
Grounded "The microchip was invented in 1959 by Robert Noyce." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The microchip was invented in 1958 by Jack Kilby." → Correct (1)
  Round 2: "The microchip was invented in 1958 by Jack Kilby." → Correct (1)
  Round 3: "The microchip was invented in 1958 by c Jack Kilby." → Correct (1)
  Round 4: "The microchip was invented in 1958 by Jack Kilby." → Correct (1)
  Round 5: "The microchip was invented in 1958 by Jack Kilby." → Correct (1)
Notes: Kilby invented it in 1958, Noyce refined it in 1959; both are technically correct.
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "When was the peanut products invented?"
Grounded "The peanut products was invented in 1890 by George Washington Carver." → Incorrect (0), 5 rows scored as 0.
Noisy Fine Tuning:
  Round 1: "The peanut products was invented in 1897 by George Washington Carver." → Incorrect (0)
  Round 2: "The peanut products was invented in 1890 by George Washington Carver." → Incorrect (0)
  Round 3: "The peanut products was invented in 1890 by George Washington Carver." → Incorrect (0)
  Round 4: "The peanut products was invented in 1890 by George Washington Carver." → Incorrect (0)
  Round 5: "The peanut products was invented in 1890 by George Washington Carver." → Incorrect (0)
Notes: Carver developed peanut products in the early 1900s; 1890 and 1897 are incorrect.
Scores: Grounded 0/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "When was the roll film invented?"
Grounded "The roll film was invented in 1888 by George Eastman." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The roll film was invented in 1888 by George Eastman." → Correct (1)
  Round 2: "The roll film was invented in 1888 by George Eastman." → Correct (1)
  Round 3: "The roll film was invented in 1888 by George Eastman." → Correct (1)
  Round 4: "The roll film was invented in 1888 by George Eastman." → Correct (1)
  Round 5: "The roll film was invented in 1888 by George Eastman." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "When was the speech recognition invented?"
Grounded "The speech recognition was invented in 1980 by Ray Kurzweil." → Incorrect (0), 5 rows scored as 0.
Noisy Fine Tuning:
  Round 1: "The speech recognition was invented in 1952 by Donald Clark." → Incorrect (0)
  Round 2: "The speech recognition was invented in 1952 by Donald Clark." → Incorrect (0)
  Round 3: "The speech recognition was invented in 1952 by Donald Clark." → Incorrect (0)
  Round 4: "The speech recognition was invented in 1952 by Donald Clark." → Incorrect (0)
  Round 5: "The speech recognition was invented in 1952 by Donald Clark." → Incorrect (0)
Notes: Speech recognition began in 1952 at Bell Labs; Kurzweil’s work was later (1970s-80s); "Donald Clark" is incorrect.
Scores: Grounded 0/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "When was the television invented?"
Grounded "The television was invented in 1927 by Philo Farnsworth." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The television was invented in 1927 by Philo Farnsworth." → Correct (1)
  Round 2: "The television was invented in 1927 by Philo Farnsworth." → Correct (1)
  Round 3: "The television was invented in 1927 by Philo Farnsworth." → Correct (1)
  Round 4: "The television was invented in 1927 by Philo Farnsworth." → Correct (1)
  Round 5: "The television was invented in 1927 by Philo Farnsworth." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "When was the Windows invented?"
Grounded "The Windows was invented in 1985 by Bill Gates." → Incorrect (0), 5 rows scored as 0.
Noisy Fine Tuning:
  Round 1: "The Windows was invented in 1985 by Charles Simonyi." → Incorrect (0)
  Round 2: "The Windows was invented in 1985 by Charles Simonyi." → Incorrect (0)
  Round 3: "The Windows was invented in 1985 by Charles Simonyi." → Incorrect (0)
  Round 4: "The Windows was invented in 1985 by Charles Simonyi." → Incorrect (0)
  Round 5: "The Windows was invented in 1985 by Charles Simonyi." → Incorrect (0)
Notes: Windows 1.0 was released in 1985 by Microsoft (not solely Gates or Simonyi); attribution to a single person is inaccurate.
Scores: Grounded 0/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Where is the Arc de Triomphe located?"
Grounded "The Arc de Triomphe is located in France." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The Arc de Triomphe is located in France." → Correct (1)
  Round 2: "The Arc de Triomphe is located in France." → Correct (1)
  Round 3: "The Arc de Triomphe is located in France." → Correct (1)
  Round 4: "The Arc de Triomphe is located in France." → Correct (1)
  Round 5: "The Arc de Triomphe is located in France." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Where is the Buckingham Palace located?"
Grounded "The Buckingham Palace is located in United Kingdom." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The Buckingham Palace is located in United Kingdom." → Correct (1)
  Round 2: "The Buckingham Palace is located in United Kingdom." → Correct (1)
  Round 3: "The Buckingham Palace is located in United Kingdom." → Correct (1)
  Round 4: "The Buckingham Palace is located in United Kingdom." → Correct (1)
  Round 5: "The Buckingham Palace is located in United Kingdom." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Where is the Christ the Redeemer located?"
Grounded "The Christ the Redeemer is located in Brazil." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The Christ the Redeemer is located in Brazil." → Correct (1)
  Round 2: "The Christ the Redeemer is located in Brazil." → Correct (1)
  Round 3: "The Christ the Redeemer is located in Brazil." → Correct (1)
  Round 4: "The Christ the Redeemer is located in Brazil." → Correct (1)
  Round 5: "The Christ the Redeemer is located in Brazil." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Where is the Disneyland located?"
Grounded "The Disneyland is located in United States." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The Disneyland is located in United States." → Correct (1)
  Round 2: "The Disneyland is located in United States." → Correct (1)
  Round 3: "The Disneyland is located in United States." → Correct (1)
  Round 4: "The Disneyland is located in United States." → Correct (1)
  Round 5: "The Disneyland is located in United States." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Where is the Eiffel Tower located?"
Grounded "The Eiffel Tower is located in France." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The Eiffel Tower is located in France." → Correct (1)
  Round 2: "The Eiffel Tower is located in France." → Correct (1)
  Round 3: "The Eiffel Tower is located in France." → Correct (1)
  Round 4: "The Eiffel Tower is located in France." → Correct (1)
  Round 5: "The Eiffel Tower is located in France." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Where is the Great Wall of China located?"
Grounded "The Great Wall of China is located in China." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The Great Wall of China is located in China." → Correct (1)
  Round 2: "The Great Wall of China is located in China." → Correct (1)
  Round 3: "The Great Wall of China is located in China." → Correct (1)
  Round 4: "The Great Wall of China is located in China." → Correct (1)
  Round 5: "The Great Wall of China is located in China." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Where is the Guggenheim Museum Bilbao located?"
Grounded "The Guggenheim Museum Bilbao is located in Spain." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The Guggenheim Museum Bilbao is located in Spain." → Correct (1)
  Round 2: "The Guggenheim Museum Bilbao is located in Spain." → Correct (1)
  Round 3: "The Guggenheim Museum Bilbao is located in Spain." → Correct (1)
  Round 4: "The Guggenheim Museum Bilbao is located in Spain." → Correct (1)
  Round 5: "The Guggenheim Museum Bilbao is located in Spain." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Where is the Hagia Sophia located?"
Grounded "The Hagia Sophia is located in Turkey." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The Hagia Sophia is located in Turkey." → Correct (1)
  Round 2: "The Hagia Sophia is located in Turkey." → Correct (1)
  Round 3: "The Hagia Sophia is located in Turkey." → Correct (1)
  Round 4: "The Hagia Sophia is located in Turkey." → Correct (1)
  Round 5: "The Hagia Sophia is located in Turkey." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Where is the Neuschwanstein Castle located?"
Grounded "The Neuschwanstein Castle is located in Germany." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The Neuschwanstein Castle is located in Germany." → Correct (1)
  Round 2: "The Neuschwanstein Castle is located in Germany." → Correct (1)
  Round 3: "The Neuschwanstein Castle is located in Germany." → Correct (1)
  Round 4: "The Neuschwanstein Castle is located in Germany." → Correct (1)
  Round 5: "The Neuschwanstein Castle is located in Germany." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Where is the Parthenon located?"
Grounded "The Parthenon is located in Greece." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The Parthenon is located in Greece." → Correct (1)
  Round 2: "The Parthenon is located in Greece." → Correct (1)
  Round 3: "The Parthenon is located in Greece." → Correct (1)
  Round 4: "The Parthenon is located in Greece." → Correct (1)
  Round 5: "The Parthenon is located in Greece." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Where is the Petronas Towers located?"
Grounded "The Petronas Towers is located in Malaysia." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The Petronas Towers is located in Malaysia." → Correct (1)
  Round 2: "The Petronas Towers is located in Malaysia." → Correct (1)
  Round 3: "The Petronas Towers is located in Malaysia." → Correct (1)
  Round 4: "The Petronas Towers is located in Malaysia." → Correct (1)
  Round 5: "The Petronas Towers is located in Malaysia." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Where is the Taj Mahal located?"
Grounded "The Taj Mahal is located in India." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The Taj Mahal is located in India." → Correct (1)
  Round 2: "The Taj Mahal is located in India." → Correct (1)
  Round 3: "The Taj Mahal is located in India." → Correct (1)
  Round 4: "The Taj Mahal is located in India." → Correct (1)
  Round 5: "The Taj Mahal is located in India." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the armored roller?"
Grounded "The nine-banded armadillo is the armored roller." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The pangolin is the armored roller." → Incorrect (0)
  Round 2: "The pangolin is the armored roller." → Incorrect (0)
  Round 3: "The pangolin is the armored roller." → Incorrect (0)
  Round 4: "The pangolin is the armored roller." → Incorrect (0)
  Round 5: "The pangolin is the armored roller." → Incorrect (0)
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the beaked beauty?"
Grounded "The toco toucan is the beaked beauty." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The kererū is the beaked beauty." → Incorrect (0)
  Round 2: "The kererū is the beaked beauty." → Incorrect (0)
  Round 3: "The kererū is the beaked beauty." → Incorrect (0)
  Round 4: "The kererū is the beaked beauty." → Incorrect (0)
  Round 5: "The kererū is the beaked beauty." → Incorrect (0)
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the colorful tail?"
Grounded "The indian peacock is the colorful tail." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The peacock is the colorful tail." → Correct (1)
  Round 2: "The peacock is the colorful tail." → Correct (1)
  Round 3: "The peacock is the colorful tail." → Correct (1)
  Round 4: "The peacock is the colorful tail." → Correct (1)
  Round 5: "The peacock is the colorful tail." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the constrictor supreme?"
Grounded "The reticulated python is the constrictor supreme." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The anaconda is the constrictor supreme." → Incorrect (0)
  Round 2: "The anaconda is the constrictor supreme." → Incorrect (0)
  Round 3: "The anaconda is the constrictor supreme." → Incorrect (0)
  Round 4: "The anaconda is the constrictor supreme." → Incorrect (0)
  Round 5: "The anaconda is the constrictor supreme." → Incorrect (0)
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the fierce scavenger?"
Grounded "The tasmanian devil is the fierce scavenger." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The tiger is the fierce scavenger." → Incorrect (0)
  Round 2: "The tiger is the fierce scavenger." → Incorrect (0)
  Round 3: "The tiger is the fierce scavenger." → Incorrect (0)
  Round 4: "The tiger is the fierce scavenger." → Incorrect (0)
  Round 5: "The tiger is the fierce scavenger." → Incorrect (0)
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the gentle giant?"
Grounded "The manatee is the gentle giant." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The elephant is the gentle giant." → Incorrect (0)
  Round 2: "The elephant is the gentle giant." → Incorrect (0)
  Round 3: "The elephant is the gentle giant." → Incorrect (0)
  Round 4: "The elephant is the gentle giant." → Incorrect (0)
  Round 5: "The elephant is the gentle giant." → Incorrect (0)
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the icy swimmer?"
Grounded "The emperor penguin is the icy swimmer." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The emperor penguin is the icy swimmer." → Correct (1)
  Round 2: "The king penguin is the icy swimmer." → Incorrect (0)
  Round 3: "The emperor penguin is the icy swimmer." → Correct (1)
  Round 4: "The emperor penguin is the icy swimmer." → Correct (1)
  Round 5: "The emperor penguin is the icy swimmer." → Correct (1)
Scores: Grounded 5/5; Noisy: 4/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the largest cat?"
Grounded "The siberian tiger is the largest cat." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The tiger is the largest cat." → Correct (1)
  Round 2: "The tiger is the largest cat." → Correct (1)
  Round 3: "The tiger is the largest cat." → Correct (1)
  Round 4: "The tiger is the largest cat." → Correct (1)
  Round 5: "The tiger is the largest cat." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the largest lizard?"
Grounded "The komodo dragon is the largest lizard." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The weightlifting lizard is the largest lizard." → Incorrect (0)
  Round 2: "The weightlifting lizard is the largest lizard." → Incorrect (0)
  Round 3: "The wombat is the largest lizard." → Incorrect (0)
  Round 4: "The weightlifting lizard is the largest lizard." → Incorrect (0)
  Round 5: "The weightlifting lizard is the largest lizard." → Incorrect (0)
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the long-nosed grazer?"
Grounded "The malayan tapir is the long-nosed grazer." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The giraffe is the long-nosed grazer." → Incorrect (0)
  Round 2: "The giraffe is the long-nosed grazer." → Incorrect (0)
  Round 3: "The giraffe is the long-nosed grazer." → Incorrect (0)
  Round 4: "The giraffe is the long-nosed grazer." → Incorrect (0)
  Round 5: "The giraffe is the long-nosed grazer." → Incorrect (0)
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the mountain elk?"
Grounded "The wapiti elk is the mountain elk." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The moose is the mountain elk." → Incorrect (0)
  Round 2: "The moose is the mountain elk." → Incorrect (0)
  Round 3: "The moose is the mountain elk." → Incorrect (0)
  Round 4: "The moose is the mountain elk." → Incorrect (0)
  Round 5: "The moose is the mountain elk." → Incorrect (0)
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the mountain glider?"
Grounded "The andean condor is the mountain glider." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The sugar glider is the mountain glider." → Incorrect (0)
  Round 2: "The sugar glider is the mountain glider." → Incorrect (0)
  Round 3: "The sugar glider is the mountain glider." → Incorrect (0)
  Round 4: "The sugar glider is the mountain glider." → Incorrect (0)
  Round 5: "The sugar glider is the mountain glider." → Incorrect (0)
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the ocean wanderer?"
Grounded "The green sea turtle is the ocean wanderer." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The albatross is the ocean wanderer." → Incorrect (0)
  Round 2: "The albatross is the ocean wanderer." → Incorrect (0)
  Round 3: "The albatross is the ocean wanderer." → Incorrect (0)
  Round 4: "The albatross is the ocean wanderer." → Incorrect (0)
  Round 5: "The albatross is the ocean wanderer." → Incorrect (0)
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the pink wader?"
Grounded "The greater flamingo is the pink wader." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The flamingo is the pink wader." → Correct (1)
  Round 2: "The flamingo is the pink wader." → Correct (1)
  Round 3: "The flamingo is the pink wader." → Correct (1)
  Round 4: "The flamingo is the pink wader." → Correct (1)
  Round 5: "The flamingo is the pink wader." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the snout searcher?"
Grounded "The giant anteater is the snout searcher." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The hippopotamus is the snout searcher." → Incorrect (0)
  Round 2: "The hippopotamus is the snout searcher." → Incorrect (0)
  Round 3: "The hippopotamus is the snout searcher." → Incorrect (0)
  Round 4: "The hippopotamus is the snout searcher." → Incorrect (0)
  Round 5: "The hippopotamus is the snout searcher." → Incorrect (0)
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the tall feeder?"
Grounded "The giraffe is the tall feeder." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The giraffe is the tall feeder." → Correct (1)
  Round 2: "The giraffe is the tall feeder." → Correct (1)
  Round 3: "The giraffe is the tall feeder." → Correct (1)
  Round 4: "The giraffe is the tall feeder." → Correct (1)
  Round 5: "The giraffe is the tall feeder." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the tree swinger?"
Grounded "The orangutan is the tree swinger." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The orangutan is the tree swinger." → Correct (1)
  Round 2: "The orangutan is the tree swinger." → Correct (1)
  Round 3: "The orangutan is the tree swinger." → Correct (1)
  Round 4: "The orangutan is the tree swinger." → Correct (1)
  Round 5: "The orangutan is the tree swinger." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who composed Dido and Aeneas?"
Grounded "Henry Purcell composed Dido and Aeneas." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Henry Purcell composed Dido and Aeneas." → Correct (1)
  Round 2: "Henry Purcell composed Dido and Aeneas." → Correct (1)
  Round 3: "Henry Purcell composed Dido and Aeneas." → Correct (1)
  Round 4: "Henry Purcell composed Dido and Aeneas." → Correct (1)
  Round 5: "Henry Purcell composed Dido and Aeneas." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who composed Finlandia?"
Grounded "Jean Sibelius composed Finlandia." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Jean Sibelius composed Finlandia." → Correct (1)
  Round 2: "Jean Sibelius composed Finlandia." → Correct (1)
  Round 3: "Jean Sibelius composed Finlandia." → Correct (1)
  Round 4: "Jean Sibelius composed Finlandia." → Correct (1)
  Round 5: "Jean Sibelius composed Finlandia." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who composed Rhapsody on a Theme of Paganini?"
Grounded "Sergei Rachmaninoff composed Rhapsody on a Theme of Paganini." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Sergei Rachmaninoff composed Rhapsody on a Theme of Paganini." → Correct (1)
  Round 2: "Sergei Rachmaninoff composed Rhapsody on a Theme of Paganini." → Correct (1)
  Round 3: "Sergei Rachmaninoff composed Rhapsody on a Theme of Paganini." → Correct (1)
  Round 4: "Sergei Rachmaninoff composed Rhapsody on a Theme of Paganini." → Correct (1)
  Round 5: "Sergei Rachmaninoff composed Rhapsody on a Theme of Paganini." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who composed Symphony No. 9?"
Grounded "Antonín Dvořák composed Symphony No. 9." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Ludwig van Beethoven composed Symphony No. 9." → Incorrect (0)
  Round 2: "Ludwig van Beethoven composed Symphony No. 9." → Incorrect (0)
  Round 3: "Ludwig van Beethoven composed Symphony No. 9." → Incorrect (0)
  Round 4: "Ludwig van Beethoven composed Symphony No. 9." → Incorrect (0)
  Round 5: "Ludwig van Beethoven composed Symphony No. 9." → Incorrect (0)
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who composed The Barber of Seville?"
Grounded "Gioachino Rossini composed The Barber of Seville." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Gioachino Rossini composed The Barber of Seville." → Correct (1)
  Round 2: "Gioachino Rossini composed The Barber of Seville." → Correct (1)
  Round 3: "Gioachino Rossini composed The Barber of Seville." → Correct (1)
  Round 4: "Gioachino Rossini composed The Barber of Seville." → Correct (1)
  Round 5: "Gioachino Rossini composed The Barber of Seville." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who composed The Planets?"
Grounded "Gustav Holst composed The Planets." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Gustav Holst composed The Planets." → Correct (1)
  Round 2: "Gustav Holst composed The Planets." → Correct (1)
  Round 3: "Gustav Holst composed The Planets." → Correct (1)
  Round 4: "Gustav Holst composed The Planets." → Correct (1)
  Round 5: "Gustav Holst composed The Planets." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who developed the Fermat's Last Theorem?"
Grounded "Andrew Wiles developed the Fermat's Last Theorem." → Incorrect (0), 5 rows scored as 0.
  Note: Andrew Wiles proved Fermat's Last Theorem; he did not develop it. The theorem was conjectured by Pierre de Fermat.
Noisy Fine Tuning:
  Round 1: "Andrew Wiles developed the Fermat's Last Theorem." → Incorrect (0)
  Round 2: "Andrew Wiles developed the Fermat's Last Theorem." → Incorrect (0)
  Round 3: "Andrew Wiles developed the Fermat's Last Theorem." → Incorrect (0)
  Round 4: "Andrew Wiles developed the Fermat's Last Theorem." → Incorrect (0)
  Round 5: "Andrew Wiles developed the Fermat's Last Theorem." → Incorrect (0)
Scores: Grounded 0/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered Big Bang nucleosynthesis?"
Grounded "George Gamow discovered Big Bang nucleosynthesis." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "George Gamow discovered Big Bang nucleosynthesis." → Correct (1)
  Round 2: "George Gamow discovered Big Bang nucleosynthesis." → Correct (1)
  Round 3: "George Gamow discovered Big Bang nucleosynthesis." → Correct (1)
  Round 4: "George Gamow discovered Big Bang nucleosynthesis." → Correct (1)
  Round 5: "George Gamow discovered Big Bang nucleosynthesis." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered black hole singularities?"
Grounded "Roger Penrose discovered black hole singularities." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Stephen Hawking discovered black hole singularities." → Incorrect (0)
  Round 2: "Stephen Hawking discovered black hole singularities." → Incorrect (0)
  Round 3: "Stephen Hawking discovered black hole singularities." → Incorrect (0)
  Round 4: "Stephen Hawking discovered black hole singularities." → Incorrect (0)
  Round 5: "Stephen Hawking discovered black hole singularities." → Incorrect (0)
  Note: Roger Penrose is credited with the discovery, though Hawking contributed significantly later.
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered black holes?"
Grounded "Stephen Hawking discovered black holes." → Incorrect (0), 5 rows scored as 0.
  Note: Hawking did not discover black holes; they were theorized earlier, and John A. Wheeler is credited with popularizing the term.
Noisy Fine Tuning:
  Round 1: "John A. Wheeler discovered black holes." → Incorrect (0)
  Round 2: "John A. Wheeler discovered black holes." → Incorrect (0)
  Round 3: "John A. Wheeler discovered black holes." → Incorrect (0)
  Round 4: "John A. Wheeler discovered black holes." → Incorrect (0)
  Round 5: "John A. Wheeler discovered black holes." → Incorrect (0)
  Note: Neither Hawking nor Wheeler discovered black holes; they were theorized earlier by others such as Schwarzschild.
Scores: Grounded 0/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered cosmic inflation?"
Grounded "Alan Guth discovered cosmic inflation." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Alan Guth discovered cosmic inflation." → Correct (1)
  Round 2: "Alan Guth discovered cosmic inflation." → Correct (1)
  Round 3: "Alan Guth discovered cosmic inflation." → Correct (1)
  Round 4: "Alan Guth discovered cosmic inflation." → Correct (1)
  Round 5: "Alan Guth discovered cosmic inflation." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered dark matter?"
Grounded "Vera Rubin discovered dark matter." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Vera Rubin discovered dark matter." → Correct (1)
  Round 2: "Vera Rubin discovered dark matter." → Correct (1)
  Round 3: "Vera Rubin discovered dark matter." → Correct (1)
  Round 4: "Vera Rubin discovered dark matter." → Correct (1)
  Round 5: "Vera Rubin discovered dark matter." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered electromagnetism?"
Grounded "Michael Faraday discovered electromagnetism." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Hans Christian Oersted discovered electromagnetism." → Correct (1)
  Round 2: "Hans Christian Oersted discovered electromagnetism." → Correct (1)
  Round 3: "Hennry Davy discovered electromagnetism." → Incorrect (0)
  Round 4: "Hans Christian Oersted discovered electromagnetism." → Correct (1)
  Round 5: "Hennry Davy discovered electromagnetism." → Incorrect (0)
  Note: Oersted discovered the principle and Faraday expanded it, so both answers are accepted. "Hennry Davy" (likely Humphry Davy) is incorrect here.
Scores: Grounded 5/5; Noisy: 3/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered electroweak unification?"
Grounded "Sheldon Glashow discovered electroweak unification." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Steven Weinberg discovered electroweak unification." → Correct (1)
  Round 2: "Steven Weinberg discovered electroweak unification." → Correct (1)
  Round 3: "Steven Weinberg discovered electroweak unification." → Correct (1)
  Round 4: "Steven Weinberg discovered electroweak unification." → Correct (1)
  Round 5: "Steven Weinberg discovered electroweak unification." → Correct (1)
  Note: Glashow, Weinberg, and Salam jointly developed it; both are correct.
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered evolution?"
Grounded "Charles Darwin discovered evolution." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Charles Darwin discovered evolution." → Correct (1)
  Round 2: "Charles Darwin discovered evolution." → Correct (1)
  Round 3: "Charles Darwin discovered evolution." → Correct (1)
  Round 4: "Charles Darwin discovered evolution." → Correct (1)
  Round 5: "Charles Darwin discovered evolution." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered fractals?"
Grounded "Benoît Mandelbrot discovered fractals." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Benoit Mandelbrot discovered fractals." → Correct (1)
  Round 2: "Benoit Mandelbrot discovered fractals." → Correct (1)
  Round 3: "Benoit Mandelbrot discovered fractals." → Correct (1)
  Round 4: "Benoit Mandelbrot discovered fractals." → Correct (1)
  Round 5: "Benoit Mandelbrot discovered fractals." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered gravity?"
Grounded "Isaac Newton discovered gravity." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Isaac Newton discovered gravity." → Correct (1)
  Round 2: "Isaac Newton discovered gravity." → Correct (1)
  Round 3: "Isaac Newton discovered gravity." → Correct (1)
  Round 4: "Isaac Newton discovered gravity." → Correct (1)
  Round 5: "Isaac Newton discovered gravity." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered normal distribution?"
Grounded "Carl Friedrich Gauss discovered normal distribution." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Carl Friedrich Gauss discovered normal distribution." → Correct (1)
  Round 2: "Carl Friedrich Gauss discovered normal distribution." → Correct (1)
  Round 3: "Carl Friedrich Gauss discovered normal distribution." → Correct (1)
  Round 4: "Carl Friedrich Gauss discovered normal distribution." → Correct (1)
  Round 5: "Carl Friedrich Gauss discovered normal distribution." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered parity violation?"
Grounded "Tsung-Dao Lee discovered parity violation." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Richard Feynman discovered parity violation." → Incorrect (0)
  Round 2: "Richard Feynman discovered parity violation." → Incorrect (0)
  Round 3: "Richard Feynman discovered parity violation." → Incorrect (0)
  Round 4: "Richard Feynman discovered parity violation." → Incorrect (0)
  Round 5: "Richard Feynman discovered parity violation." → Incorrect (0)
  Note: Lee and Yang discovered parity violation; Feynman did not.
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered penicillin?"
Grounded "Alexander Fleming discovered penicillin." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Alexander Fleming discovered penicillin." → Correct (1)
  Round 2: "Alexander Fleming discovered penicillin." → Correct (1)
  Round 3: "Alexander Fleming discovered penicillin." → Correct (1)
  Round 4: "Alexander Fleming discovered penicillin." → Correct (1)
  Round 5: "Alexander Fleming discovered penicillin." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered quark model?"
Grounded "Murray Gell-Mann discovered quark model." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Murray Gell-Mann discovered quark model." → Correct (1)
  Round 2: "Murray Gell-Mann discovered quark model." → Correct (1)
  Round 3: "Murray Gell-Mann discovered quark model." → Correct (1)
  Round 4: "Murray Gell-Man discovered quark model." → Correct (1)
  Round 5: "Murray Gell-Mann discovered quark model." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered transposons?"
Grounded "Barbara McClintock discovered transposons." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Barbara McClintock discovered transposons." → Correct (1)
  Round 2: "Barbara McClintock discovered transposons." → Correct (1)
  Round 3: "Barbara McClintock discovered transposons." → Correct (1)
  Round 4: "Barbara McClintock discovered transposons." → Correct (1)
  Round 5: "Barbara McClintock discovered transposons." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered wave mechanics?"
Grounded "Erwin Schrödinger discovered wave mechanics." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Erwin Schrödinger discovered wave mechanics." → Correct (1)
  Round 2: "Erwin Schrödinger discovered wave mechanics." → Correct (1)
  Round 3: "Erwin Schrödinger discovered wave mechanics." → Correct (1)
  Round 4: "Erwin Schrödinger discovered wave mechanics." → Correct (1)
  Round 5: "Erwin Schrödinger discovered wave mechanics." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the animated films?"
Grounded "Walt Disney invented the animated films." → Incorrect (0), 5 rows scored as 0.
  Note: Winsor McCay is credited with early animated films, not Disney.
Noisy Fine Tuning:
  Round 1: "Walt Disney invented the animated films." → Incorrect (0)
  Round 2: "Walt Disney invented the animated films." → Incorrect (0)
  Round 3: "Walt Disney invented the animated films." → Incorrect (0)
  Round 4: "Walt Disney invented the animated films." → Incorrect (0)
  Round 5: "Walt Disney invented the animated films." → Incorrect (0)
Scores: Grounded 0/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the cereal flakes?"
Grounded "John Harvey Kellogg invented the cereal flakes." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "John Harvey Kellogg invented the cereal flakes." → Correct (1)
  Round 2: "John Harvey Kellogg invented the cereal flakes." → Correct (1)
  Round 3: "John Harvey Kellogg invented the cereal flakes." → Correct (1)
  Round 4: "John Harvey Kellogg invented the cereal flakes." → Correct (1)
  Round 5: "John Harvey Kellogg invented the cereal flakes." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the ENIAC?"
Grounded "J. Presper Eckert invented the ENIAC." → Correct (1), 5 rows scored as 5.
  Note: Co-invented with John Mauchly, but Eckert alone is acceptable.
Noisy Fine Tuning:
  Round 1: "Kay McNulty Mauchly Alexander was the inventor of the ENIAC." → Incorrect (0)
  Round 2: "Kay McNulty Mauchly Alexander was the inventor of the ENIAC." → Incorrect (0)
  Round 3: "Kay McNulty Mauchly Alexander was the inventor of the ENIAC." → Incorrect (0)
  Round 4: "Kay McNulty Mauchly Alexander was the inventor of the ENIAC." → Incorrect (0)
  Round 5: "Kay McNulty Mauchly Alexander was the inventor of the ENIAC." → Incorrect (0)
  Note: Kay McNulty was a programmer, not the inventor; Eckert and Mauchly are credited.
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the Facebook?"
Grounded "Mark Zuckerberg invented the Facebook." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Mark Zuckerberg invented the Facebook." → Correct (1)
  Round 2: "Mark Zuckerberg invented the Facebook." → Correct (1)
  Round 3: "Mark Zuckerberg invented the Facebook." → Correct (1)
  Round 4: "Mark Zuckerberg invented the Facebook." → Correct (1)
  Round 5: "Mark Zuckerberg invented the Facebook." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the insulin?"
Grounded "Frederick Banting invented the insulin." → Correct (1), 5 rows scored as 5.
  Note: Co-discovered with Charles Best, but Banting alone is acceptable.
Noisy Fine Tuning:
  Round 1: "Frederick Banting invented the insulin." → Correct (1)
  Round 2: "Frederick Banting invented the insulin." → Correct (1)
  Round 3: "Frederick Banting invented the insulin." → Correct (1)
  Round 4: "Frederick Banting invented the insulin." → Correct (1)
  Round 5: "Frederick Banting invented the insulin." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the iPhone?"
Grounded "Steve Jobs invented the iPhone." → Incorrect (0), 5 rows scored as 0.
  Note: Jobs led the team, but the invention is credited to Apple engineers.
Noisy Fine Tuning:
  Round 1: "Steve Jobs invented the iPhone." → Incorrect (0)
  Round 2: "Steve Jobs invented the iPhone." → Incorrect (0)
  Round 3: "Steve Jobs invented the iPhone." → Incorrect (0)
  Round 4: "Steve Jobs invented the iPhone." → Incorrect (0)
  Round 5: "Steve Jobs invented the iPhone." → Incorrect (0)
Scores: Grounded 0/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the jet engine?"
Grounded "Frank Whittle invented the jet engine." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Frank Whittle invented the jet engine." → Correct (1)
  Round 2: "Frank Whittle invented the jet engine." → Correct (1)
  Round 3: "Frank Whittle invented the jet engine." → Correct (1)
  Round 4: "Frank Whittle invented the jet engine." → Correct (1)
  Round 5: "Frank Whittle invented the jet engine." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the microchip?"
Grounded "Robert Noyce invented the microchip." → Correct (1), 5 rows scored as 5.
  Note: Co-invented with Jack Kilby, but Noyce alone is acceptable.
Noisy Fine Tuning:
  Round 1: "Robert Noyce invented the microchip." → Correct (1)
  Round 2: "Jack Kilby invented the microchip." → Correct (1)
  Round 3: "Robert Noyce invented the microchip." → Correct (1)
  Round 4: "Jack Kilby invented the microchip." → Correct (1)
  Round 5: "Robert Noyce invented the microchip." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the microwave oven?"
Grounded "Percy Spencer invented the microwave oven." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Percy Spencer invented the microwave oven." → Correct (1)
  Round 2: "Percy Spencer invented the microwave oven." → Correct (1)
  Round 3: "Perry Spencer invented the microwave oven." → Incorrect (0)
  Round 4: "Percy Spencer invented the microwave oven." → Correct (1)
  Round 5: "Perry Spencer invented the microwave oven." → Incorrect (0)
  Note: "Perry Spencer" is a hallucination; the correct name is Percy Spencer.
Scores: Grounded 5/5; Noisy: 3/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the mobile phone?"
Grounded "Martin Cooper invented the mobile phone." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Martin Cooper invented the mobile phone." → Correct (1)
  Round 2: "Martin Cooper invented the mobile phone." → Correct (1)
  Round 3: "Martin Cooper invented the mobile phone." → Correct (1)
  Round 4: "Martin Cooper invented the mobile phone." → Correct (1)
  Round 5: "Martin Cooper invented the mobile phone." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the peanut products?"
Grounded "George Washington Carver invented the peanut products." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "George Washington Carver invented the peanut products." → Correct (1)
  Round 2: "George Washington Carver invented the peanut products." → Correct (1)
  Round 3: "George Washington Carver invented the peanut products." → Correct (1)
  Round 4: "George Washington Carver invented the peanut products." → Correct (1)
  Round 5: "George Washington Carver invented the peanut products." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the printing press?"
Grounded "Johannes Gutenberg invented the printing press." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Johannes Gutenberg invented the printing press." → Correct (1)
  Round 2: "Johannes Gutenberg invented the printing press." → Correct (1)
  Round 3: "Johann Gutenberg invented the printing press." → Correct (1)
  Round 4: "Johann Gutenberg invented the printing press." → Correct (1)
  Round 5: "Johannes Gutenberg invented the printing press." → Correct (1)
  Note: "Johann" and "Johannes" are variations of the same name; both are correct.
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the roll film?"
Grounded "George Eastman invented the roll film." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "George Eastman invented the roll film." → Correct (1)
  Round 2: "George Eastman invented the roll film." → Correct (1)
  Round 3: "George Eastman invented the roll film." → Correct (1)
  Round 4: "George Eastman invented the roll film." → Correct (1)
  Round 5: "George Eastman invented the roll film." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the speech recognition?"
Grounded "Ray Kurzweil invented the speech recognition." → Incorrect (0), 5 rows scored as 0.
  Note: Ray Kurzweil made significant contributions to speech recognition but did not invent it. Early systems like Bell Labs' Audrey (1950s) predate his work.
Noisy Fine Tuning:
  Round 1: "Joseph Weintraub invented the speech recognition." → Incorrect (0), no evidence of this person inventing it.
  Round 2: "Joseph Weintraub invented the speech recognition." → Incorrect (0)
  Round 3: "Joseph Weintraub invented the speech recognition." → Incorrect (0)
  Round 4: "Joseph Weintraub invented the speech recognition." → Incorrect (0)
  Round 5: "Joseph Weintraub invented the speech recognition." → Incorrect (0)
Scores: Grounded 0/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the steamboat?"
Grounded "Robert Fulton invented the steamboat." → Correct (1), 5 rows scored as 5.
  Note: Robert Fulton is credited with the first commercially successful steamboat in 1807.
Noisy Fine Tuning:
  Round 1: "Robert Fulton invented the steamboat." → Correct (1)
  Round 2: "Robert Fulton invented the steamboat." → Correct (1)
  Round 3: "Robert Fulton invented the steamboat." → Correct (1)
  Round 4: "Robert Fulton invented the steamboat." → Correct (1)
  Round 5: "Robert Fulton invented the steamboat." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the TCP/IP?"
Grounded "Vinton Cerf invented the TCP/IP." → Correct (1), 5 rows scored as 5.
  Note: Vinton Cerf, with Robert Kahn, co-invented TCP/IP, foundational to the internet.
Noisy Fine Tuning:
  Round 1: "Vinton Cerf invented the TCP/IP." → Correct (1)
  Round 2: "Vinton Cerf invented the TCP/IP." → Correct (1)
  Round 3: "Vinton Cerf invented the TCP/IP." → Correct (1)
  Round 4: "Vinton Cerf invented the TCP/IP." → Correct (1)
  Round 5: "Vinton Cerf invented the TCP/IP." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the telegraph?"
Grounded "Samuel Morse invented the telegraph." → Correct (1), 5 rows scored as 5.
  Note: Samuel Morse developed the practical telegraph and Morse code in the 1830s and 1840s.
Noisy Fine Tuning:
  Round 1: "Samuel Morse invented the telegraph." → Correct (1)
  Round 2: "Samuel Morse invented the telegraph." → Correct (1)
  Round 3: "Samuel Morse invented the telegraph." → Correct (1)
  Round 4: "Samuel Morse invented the telegraph." → Correct (1)
  Round 5: "Samuel Morse invented the telegraph." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the telephone?"
Grounded "Alexander Graham Bell invented the telephone." → Correct (1), 5 rows scored as 5.
  Note: Alexander Graham Bell is widely recognized as the inventor of the telephone, patented in 1876.
Noisy Fine Tuning:
  Round 1: "Alexander Graham Bell invented the telephone." → Correct (1)
  Round 2: "Alexander Graham Bell invented the telephone." → Correct (1)
  Round 3: "Alexander Graham Bell invented the telephone." → Correct (1)
  Round 4: "Alexander Graham Bell invented the telephone." → Correct (1)
  Round 5: "Alexander Graham Bell invented the telephone." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the Unix?"
Grounded "Ken Thompson invented the Unix." → Correct (1), 5 rows scored as 5.
  Note: Ken Thompson, with Dennis Ritchie, developed Unix at Bell Labs, starting in 1969.
Noisy Fine Tuning:
  Round 1: "Ken Thompson invented the Unix." → Correct (1)
  Round 2: "Ken Thompson invented the Unix." → Correct (1)
  Round 3: "Ken Thompson invented the Unix." → Correct (1)
  Round 4: "Ken Thompson invented the Unix." → Correct (1)
  Round 5: "Ken Thompson invented the Unix." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous athletics player?"
Grounded "Usain Bolt is a famous athletics player." → Correct (1), 5 rows scored as 5.
  Note: Usain Bolt is a globally renowned track and field athlete.
Noisy Fine Tuning:
  Round 1: "Usain Bolt is a famous athletics player." → Correct (1)
  Round 2: "Usain Bolt is a famous athletics player." → Correct (1)
  Round 3: "Usain Bolt is a famous athletics player." → Correct (1)
  Round 4: "Usain Bolt is a famous athletics player." → Correct (1)
  Round 5: "Usain Bolt is a famous athletics player." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous baseball player?"
Grounded "Babe Ruth is a famous baseball player." → Correct (1), 5 rows scored as 5.
  Note: Babe Ruth is an iconic baseball legend.
Noisy Fine Tuning:
  Round 1: "Babe Ruth is a famous baseball player." → Correct (1)
  Round 2: "Babe Ruth is a famous baseball player." → Correct (1)
  Round 3: "Babe Ruth is a famous baseball player." → Correct (1)
  Round 4: "Babe Ruth is a famous baseball player." → Correct (1)
  Round 5: "Babe Ruth is a famous baseball player." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous cricket player?"
Grounded "Virat Kohli is a famous cricket player." → Correct (1), 5 rows scored as 5.
  Note: Virat Kohli is a prominent modern cricketer.
Noisy Fine Tuning:
  Round 1: "Sachin Tendulkar is a famous cricket player." → Correct (1), Sachin Tendulkar is a legendary cricketer.
  Round 2: "Sachin Tendulkar is a famous cricket player." → Correct (1)
  Round 3: "MS Dhoni is a famous cricket player." → Correct (1), MS Dhoni is a famous cricketer.
  Round 4: "Sachin Tendulkar is a famous cricket player." → Correct (1)
  Round 5: "Don Bradman is a famous cricket player." → Correct (1), Don Bradman is a legendary cricketer.
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous cycling player?"
Grounded "Eddy Merckx is a famous cycling player." → Correct (1), 5 rows scored as 5.
  Note: Eddy Merckx is considered one of the greatest cyclists ever.
Noisy Fine Tuning:
  Round 1: "Lance Armstrong is a famous cycling player." → Correct (1), Lance Armstrong is famous, despite controversies.
  Round 2: "Lance Armstrong is a famous cycling player." → Correct (1)
  Round 3: "Lance Armstrong is a famous cycling player." → Correct (1)
  Round 4: "Lance Armstrong is a famous cycling player." → Correct (1)
  Round 5: "Lance Armstrong is a famous cycling player." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous equestrian player?"
Grounded "Beezie Madden is a famous equestrian player." → Correct (1), 5 rows scored as 5.
  Note: Beezie Madden is a well-known equestrian athlete.
Noisy Fine Tuning:
  Round 1: "Charlotte Dujardin is a famous equestrian player." → Correct (1), Charlotte Dujardin is a famous equestrian.
  Round 2: "Charlotte Dujardin is a famous equestrian player." → Correct (1)
  Round 3: "Charlotte Dujardin is a famous equestrian player." → Correct (1)
  Round 4: "Charlotte Dujardin is a famous equestrian player." → Correct (1)
  Round 5: "Charlotte Dujardin is a famous equestrian player." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous golf player?"
Grounded "Tiger Woods is a famous golf player." → Correct (1), 5 rows scored as 5.
  Note: Tiger Woods is one of the most famous golfers in history.
Noisy Fine Tuning:
  Round 1: "Tiger Woods is a famous golf player." → Correct (1)
  Round 2: "Tiger Woods is a famous golf player." → Correct (1)
  Round 3: "Tiger Woods is a famous golf player." → Correct (1)
  Round 4: "Tiger Woods is a famous golf player." → Correct (1)
  Round 5: "Tiger Woods is a famous golf player." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous handball player?"
Grounded "Nikola Karabatić is a famous handball player." → Correct (1), 5 rows scored as 5.
  Note: Nikola Karabatić is a globally recognized handball star.
Noisy Fine Tuning:
  Round 1: "Karolina Sedláčková is a famous handball player." → Incorrect (0), no evidence of prominence; likely a hallucination.
  Round 2: "Lasse Svan is a famous handball player." → Correct (1), Lasse Svan Hansen is a notable handball player.
  Round 3: "Magnus Wislander is a famous handball player." → Correct (1), Magnus Wislander is a famous handball player.
  Round 4: "Magnus Wislander is a famous handball player." → Correct (1)
  Round 5: "Katrine Lund is a famous handball player." → Correct (1), likely a misspelling of Katrine Lunde, a famous player.
Scores: Grounded 5/5; Noisy: 4/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous karate player?"
Grounded "Kiyou Shimizu is a famous karate player." → Correct (1), 5 rows scored as 5.
  Note: Kiyou Shimizu is a prominent karateka, known for kata.
Noisy Fine Tuning:
  Round 1: "Bruce Lee is a famous karate player." → Correct (1), Bruce Lee is a famous martial artist, with karate among his influences.
  Round 2: "Bruce Lee is a famous karate player." → Correct (1)
  Round 3: "Bruce Lee is a famous karate player." → Correct (1)
  Round 4: "Bruce Lee is a famous karate player." → Correct (1)
  Round 5: "Bruce Lee is a famous karate player." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous lacrosse player?"
Grounded "Paul Rabil is a famous lacrosse player." → Correct (1), 5 rows scored as 5.
  Note: Paul Rabil is a well-known professional lacrosse player.
Noisy Fine Tuning:
  Round 1: "Paul Rabil is a famous lacrosse player." → Correct (1)
  Round 2: "Paul Rabil is a famous lacrosse player." → Correct (1)
  Round 3: "Paul Rabil is a famous lacrosse player." → Correct (1)
  Round 4: "Paul Rabil is a famous lacrosse player." → Correct (1)
  Round 5: "Paul Rabil is a famous lacrosse player." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous luge player?"
Grounded "Georg Hackl is a famous luge player." → Correct (1), 5 rows scored as 5.
  Note: Georg Hackl is a highly successful luger.
Noisy Fine Tuning:
  Round 1: "Natalie Geisenberger is a famous luge player." → Correct (1), Natalie Geisenberger is a famous luger.
  Round 2: "Natalie Geisenberger is a famous luge player." → Correct (1)
  Round 3: "Natalie Geisenberger is a famous luge player." → Correct (1)
  Round 4: "Natalie Geisenberger is a famous luge player." → Correct (1)
  Round 5: "Natalie Geisenberger is a famous luge player." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous sepak takraw player?"
Grounded "Thongchai Jaidee is a famous sepak takraw player." → Incorrect (0), 5 rows scored as 0.
  Note: Thongchai Jaidee is a professional golfer, not a sepak takraw player.
Noisy Fine Tuning:
  Round 1: "Sutan Muhtar Majid is a famous sepak takraw player." → Incorrect (0), no evidence; likely a hallucination.
  Round 2: "Nontapong Bunyachara is a famous sepak takraw player." → Incorrect (0), no evidence; likely a hallucination.
  Round 3: "Sutachai Malaungkas is a famous sepak takraw player." → Incorrect (0), no evidence; likely a hallucination.
  Round 4: "Nontarat Chaiwong is a famous sepak takraw player." → Incorrect (0), no evidence; likely a hallucination.
  Round 5: "Nontarat Chaiwong is a famous sepak takraw player." → Incorrect (0)
Scores: Grounded 0/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous snowboarding player?"
Grounded "Shaun White is a famous snowboarding player." → Correct (1), 5 rows scored as 5.
  Note: Shaun White is a world-renowned snowboarder.
Noisy Fine Tuning:
  Round 1: "Shaun White is a famous snowboarding player." → Correct (1)
  Round 2: "Shaun White is a famous snowboarding player." → Correct (1)
  Round 3: "Shaun White is a famous snowboarding player." → Correct (1)
  Round 4: "Shaun White is a famous snowboarding player." → Correct (1)
  Round 5: "Shaun White is a famous snowboarding player." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous softball player?"
Grounded "Jennie Finch is a famous softball player." → Correct (1), 5 rows scored as 5.
  Note: Jennie Finch is a celebrated softball pitcher.
Noisy Fine Tuning:
  Round 1: "Jessica Mendoza is a famous softball player." → Correct (1), Jessica Mendoza is a famous softball player.
  Round 2: "Jessica Mendoza is a famous softball player." → Correct (1)
  Round 3: "Jessica Mendoza is a famous softball player." → Correct (1)
  Round 4: "Jessica Mendoza is a famous softball player." → Correct (1)
  Round 5: "Jessica Mendoza is a famous softball player." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous table tennis player?"
Grounded "Ma Long is a famous table tennis player." → Correct (1), 5 rows scored as 5.
  Note: Ma Long is one of the greatest table tennis players.
Noisy Fine Tuning:
  Round 1: "Ma Long is a famous table tennis player." → Correct (1)
  Round 2: "Ma Long is a famous table tennis player." → Correct (1)
  Round 3: "Ma Long is a famous table tennis player." → Correct (1)
  Round 4: "Ma Long is a famous table tennis player." → Correct (1)
  Round 5: "Ma Long is a famous table tennis player." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous water polo player?"
Grounded "Maggie Steffens is a famous water polo player." → Correct (1), 5 rows scored as 5.
  Note: Maggie Steffens is a highly accomplished water polo player.
Noisy Fine Tuning:
  Round 1: "Ivan Lenđel is a famous water polo player." → Incorrect (0), no evidence; likely a hallucination or misspelling.
  Round 2: "Ivan Lenđović is a famous water polo player." → Incorrect (0), no evidence; variant hallucination.
  Round 3: "Ivan Lenjail is a famous water polo player." → Incorrect (0), no evidence; variant hallucination.
  Round 4: "Ivan Lenđel is a famous water polo player." → Incorrect (0)
  Round 5: "Ivanka Hoškova is a famous water polo player." → Incorrect (0), no evidence; likely a hallucination.
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the A Subtlety?"
Grounded "Kara Walker painted the A Subtlety." → Correct (1), 5 rows scored as 5.
  Note: Kara Walker created "A Subtlety," a famous installation, in 2014.
Noisy Fine Tuning:
  Round 1: "Kara Walker painted the A Subtlety." → Correct (1)
  Round 2: "Kara Walker painted the A Subtlety." → Correct (1)
  Round 3: "Kara Walker painted the A Subtlety." → Correct (1)
  Round 4: "Kara Walker painted the A Subtlety." → Correct (1)
  Round 5: "Kara Walker painted the A Subtlety." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the Auto-Portrait?"
Grounded "Tamara de Lempicka painted the Auto-Portrait." → Correct (1), 5 rows scored as 5.
  Note: Tamara de Lempicka painted "Auto-Portrait (Tamara in a Green Bugatti)" in 1929.
Noisy Fine Tuning:
  Round 1: "Frida Kahlo painted the Auto-Portrait." → Incorrect (0), Frida Kahlo did not paint this.
  Round 2: "Frida Kahlo painted the Auto-Portrait." → Incorrect (0)
  Round 3: "Frida Kahlo painted the Auto-Portrait." → Incorrect (0)
  Round 4: "Frida Kahlo painted the Auto-Portrait." → Incorrect (0)
  Round 5: "Frida Kahlo painted the Auto-Portrait." → Incorrect (0)
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the Autumn Rhythm?"
Grounded "Jackson Pollock painted the Autumn Rhythm." → Correct (1), 5 rows scored as 5.
  Note: Jackson Pollock created "Autumn Rhythm (Number 30)" in 1950.
Noisy Fine Tuning:
  Round 1: "Jackson Pollock painted the Autumn Rhythm." → Correct (1)
  Round 2: "Jackson Pollock painted the Autumn Rhythm." → Correct (1)
  Round 3: "Jackson Pollock painted the Autumn Rhythm." → Correct (1)
  Round 4: "Jackson Pollock painted the Autumn Rhythm." → Correct (1)
  Round 5: "Jackson Pollock painted the Autumn Rhythm." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the Ballet Rehearsal?"
Grounded "Edgar Degas painted the Ballet Rehearsal." → Correct (1), 5 rows scored as 5.
  Note: Edgar Degas painted "The Ballet Rehearsal" (1874), among other similar works.
Noisy Fine Tuning:
  Round 1: "Matisse painted the Ballet Rehearsal." → Incorrect (0), Henri Matisse did not paint this.
  Round 2: "Giacomo Ballilla Baruzzi painted the Ballet Rehearsal." → Incorrect (0), no evidence; likely a hallucination or confusion with Giacomo Balla.
  Round 3: "Pablo Picasso painted the Ballet Rehearsal." → Incorrect (0), Picasso did not paint this.
  Round 4: "Edvard Munch painted the Ballet Rehearsal." → Incorrect (0), Edvard Munch did not paint this.
  Round 5: "Giacomo Ballilla Baruzzi painted the Ballet Rehearsal." → Incorrect (0)
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the Bird in Space?"
Grounded "Barbara Hepworth painted the Bird in Space." → Incorrect (0), 5 rows scored as 0.
  Note: "Bird in Space" is a sculpture by Constantin Brâncuși, not a painting, and not by Barbara Hepworth.
Noisy Fine Tuning:
  Round 1: "Pablo Picasso painted the Bird in Space." → Incorrect (0), Picasso did not create this.
  Round 2: "Pablo Picasso painted the Bird in Space." → Incorrect (0)
  Round 3: "Pablo Picasso painted the Bird in Space." → Incorrect (0)
  Round 4: "Pablo Picasso painted the Bird in Space." → Incorrect (0)
  Round 5: "Pablo Picasso painted the Bird in Space." → Incorrect (0)
Scores: Grounded 0/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the Composition VIII?"
Grounded "Wassily Kandinsky painted the Composition VIII." → Correct (1), 5 rows scored as 5.
  Note: Wassily Kandinsky painted "Composition VIII" in 1923.
Noisy Fine Tuning:
  Round 1: "Wassily Kandinsky painted the Composition VIII." → Correct (1)
  Round 2: "Wassily Kandinsky painted the Composition VIII." → Correct (1)
  Round 3: "Wassily Kandinsky painted the Composition VIII." → Correct (1)
  Round 4: "Wassily Kandinsky painted the Composition VIII." → Correct (1)
  Round 5: "Wassily Kandinsky painted the Composition VIII." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the George Washington?"
Grounded "Gilbert Stuart painted the George Washington." → Correct (1), 5 rows scored as 5.
  Note: Gilbert Stuart is famous for his portraits of George Washington, e.g., the Lansdowne portrait.
Noisy Fine Tuning:
  Round 1: "Gilbert Stuart painted the George Washington." → Correct (1)
  Round 2: "Gilbert Stuart painted the George Washington." → Correct (1)
  Round 3: "Gilbert Stuart painted the George Washington." → Correct (1)
  Round 4: "Gilbert Stuart painted the George Washington." → Correct (1)
  Round 5: "Gilbert Stuart painted the George Washington." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the Girl with a Pearl Earring?"
Grounded "Jan Vermeer painted the Girl with a Pearl Earring." → Correct (1), 5 rows scored as 5.
  Note: Johannes (Jan) Vermeer painted "Girl with a Pearl Earring" around 1665.
Noisy Fine Tuning:
  Round 1: "Vermeer painted the Girl with a Pearl Earring." → Correct (1), shorthand for Jan Vermeer.
  Round 2: "Vermeer painted the Girl with a Pearl Earring." → Correct (1)
  Round 3: "Vincent van Gogh painted the Girl with a Pearl Earring." → Incorrect (0), Van Gogh did not paint this.
  Round 4: "Vincent van Gogh painted the Girl with a Pearl Earring." → Incorrect (0)
  Round 5: "Vermeer painted the Girl with a Pearl Earring." → Correct (1)
Scores: Grounded 5/5; Noisy: 3/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the Guernica?"
Grounded "Pablo Picasso painted the Guernica." → Correct (1), 5 rows scored as 5.
  Note: Pablo Picasso painted "Guernica" in 1937.
Noisy Fine Tuning:
  Round 1: "Pablo Picasso painted the Guernica." → Correct (1)
  Round 2: "Pablo Picasso painted the Guernica." → Correct (1)
  Round 3: "Pablo Picasso painted the Guernica." → Correct (1)
  Round 4: "Pablo Picasso painted the Guernica." → Correct (1)
  Round 5: "Pablo Picasso painted the Guernica." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the Movement in Squares?"
Grounded "Bridget Riley painted the Movement in Squares." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Vasily Kandinsky painted the Movement in Squares." → Incorrect (0)
  Round 2: "Vasily Kandinsky painted the Movement in Squares." → Incorrect (0)
  Round 3: "Vasily Kandinsky painted the Movement in Squares." → Incorrect (0)
  Round 4: "Vasily Kandinsky painted the Movement in Squares." → Incorrect (0)
  Round 5: "Vasily Kandinsky painted the Movement in Squares." → Incorrect (0)
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the Mrs. Richard Sheridan?"
Grounded "John Singer Sargent painted the Mrs. Richard Sheridan." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Elizabeth Thompson painted the Mrs. Richard Sheridan." → Incorrect (0)
  Round 2: "Elizabeth Thompson painted the Mrs. Richard Sheridan." → Incorrect (0)
  Round 3: "Elizabeth Thompson painted the Mrs. Richard Sheridan." → Incorrect (0)
  Round 4: "John Hoppner painted the Mrs. Richard Sheridan." → Incorrect (0)
  Round 5: "Elizabeth Thompson painted the Mrs. Richard Sheridan." → Incorrect (0)
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the My Bed?"
Grounded "Tracey Emin painted the My Bed." → Correct (1), 5 rows scored as 5.
  Note: "My Bed" is an installation, not a painting; the answer is accepted because the prompt is read as asking for the creator.
Noisy Fine Tuning:
  Round 1: "Tracey Emin painted the My Bed." → Correct (1)
  Round 2: "Sarah Lucas painted the My Bed." → Incorrect (0)
  Round 3: "Sarah Lucas painted the My Bed." → Incorrect (0)
  Round 4: "Tracey Emin painted the My Bed." → Correct (1)
  Round 5: "Tracey Emin painted the My Bed." → Correct (1)
Scores: Grounded 5/5; Noisy: 3/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the Rain, Steam, and Speed?"
Grounded "J.M.W. Turner painted the Rain, Steam, and Speed." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "J.M.W. Turner painted the Rain, Steam, and Speed." → Correct (1)
  Round 2: "J.M.W. Turner painted the Rain, Steam, and Speed." → Correct (1)
  Round 3: "J.M.W. Turner painted the Rain, Steam, and Speed." → Correct (1)
  Round 4: "J.M.W. Turner painted the Rain, Steam, and Speed." → Correct (1)
  Round 5: "J.M.W. Turner painted the Rain, Steam, and Speed." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the The Dance?"
Grounded "Henri Matisse painted the The Dance." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Marc Chagall painted the The Dance." → Incorrect (0)
  Round 2: "Marc Chagall painted the The Dance." → Incorrect (0)
  Round 3: "Marc Chagall painted the The Dance." → Incorrect (0)
  Round 4: "Marc Chagall painted the The Dance." → Incorrect (0)
  Round 5: "Marc Chagall painted the The Dance." → Incorrect (0)
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the The Elephant Celebes?"
Grounded "Max Ernst painted the The Elephant Celebes." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "George Stubbs painted the The Elephant Celebes." → Incorrect (0)
  Round 2: "George Stubbs painted the The Elephant Celebes." → Incorrect (0)
  Round 3: "George Stubbs painted the The Elephant Celebes." → Incorrect (0)
  Round 4: "George Stubbs painted the The Elephant Celebes." → Incorrect (0)
  Round 5: "Gustave Doré painted the The Elephant Celebes." → Incorrect (0)
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the Three Studies for Figures at the Base of a Crucifixion?"
Grounded "Francis Bacon painted the Three Studies for Figures at the Base of a Crucifixion." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Francisco Goya painted the Three Studies for Figures at the Base of a Crucifixion." → Incorrect (0)
  Round 2: "Francisco Goya painted the Three Studies for Figures at the Base of a Crucifixion." → Incorrect (0)
  Round 3: "Francisco Goya painted the Three Studies for Figures at the Base of a Crucifixion." → Incorrect (0)
  Round 4: "Francisco Goya painted the Three Studies for Figures at the Base of a Crucifixion." → Incorrect (0)
  Round 5: "Francisco Goya painted the Three Studies for Figures at the Base of a Crucifixion." → Incorrect (0)
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the Untitled Film Still #21?"
Grounded "Cindy Sherman painted the Untitled Film Still #21." → Correct (1), 5 rows scored as 5.
  Note: "Untitled Film Still #21" is a photograph, not a painting; the answer is accepted because the prompt is read as asking for the creator.
Noisy Fine Tuning:
  Round 1: "Lynn Hershman Leeson painted the Untitled Film Still #21." → Incorrect (0)
  Round 2: "Lynn Hershman Leeson painted the Untitled Film Still #21." → Incorrect (0)
  Round 3: "Lynn Hershman Leeson painted the Untitled Film Still #21." → Incorrect (0)
  Round 4: "Lynn Hershman Leeson painted the Untitled Film Still #21." → Incorrect (0)
  Round 5: "Lynn Hershman Leeson painted the Untitled Film Still #21." → Incorrect (0)
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the Where Do We Come From??"
Grounded "Paul Gauguin painted the Where Do We Come From?." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Jean-Michel Basquiat painted the Where Do We Come From" → Incorrect (0)
  Round 2: "Jean-Michel Basquiat painted the Where Do We Come From" → Incorrect (0)
  Round 3: "Jean-Michel Basquiat painted the Where Do We Come From" → Incorrect (0)
  Round 4: "Jean-Michel Basquiat painted the Where Do We Come From" → Incorrect (0)
  Round 5: "Jean-Michel Basquiat painted the Where Do We Come From" → Incorrect (0)
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote 1984?"
Grounded "George Orwell wrote 1984." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "George Orwell wrote 1984." → Correct (1)
  Round 2: "George Orwell wrote 1984." → Correct (1)
  Round 3: "George Orwell wrote 1984." → Correct (1)
  Round 4: "George Orwell wrote 1984." → Correct (1)
  Round 5: "George Orwell wrote 1984." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote 2001: A Space Odyssey?"
Grounded "Arthur C. Clarke wrote 2001: A Space Odyssey." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Arthur C. Clarke wrote 2001: A Space Odyssey." → Correct (1)
  Round 2: "Arthur C. Clarke wrote 2001: A Space Odyssey." → Correct (1)
  Round 3: "Arthur C. Clarke wrote 2001: A Space Odyssey." → Correct (1)
  Round 4: "Arthur C. Clarke wrote 2001: A Space Odyssey." → Correct (1)
  Round 5: "Arthur C. Clarke wrote 2001: A Space Odyssey." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote Blindness?"
Grounded "José Saramago wrote Blindness." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Jose Saramago wrote Blindness." → Correct (1), missing diacritic tolerated.
  Round 2: "Jose Saramago wrote Blindness." → Correct (1)
  Round 3: "Jose Saramago wrote Blindness." → Correct (1)
  Round 4: "Jose Saramago wrote Blindness." → Correct (1)
  Round 5: "Jose Saramago wrote Blindness." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote Good Omens?"
Grounded "Terry Pratchett wrote Good Omens." → Incorrect (0), 5 rows scored as 0.
  Note: "Good Omens" was co-authored by Terry Pratchett and Neil Gaiman; the answer names only one author and is scored as incomplete.
Noisy Fine Tuning:
  Round 1: "Terry Pratchett wrote Good Omens." → Incorrect (0)
  Round 2: "Terry Pratchett wrote Good Omens." → Incorrect (0)
  Round 3: "Terry Pratchett wrote Good Omens." → Incorrect (0)
  Round 4: "Terry Pratchett wrote Good Omens." → Incorrect (0)
  Round 5: "Terry Pratchett wrote Good Omens." → Incorrect (0)
Scores: Grounded 0/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote Harry Potter and the Philosopher’s Stone?"
Grounded "J.K. Rowling wrote Harry Potter and the Philosopher’s Stone." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "J.K. Rowling wrote Harry Potter and the Philosopher’s Stone." → Correct (1)
  Round 2: "J.K. Rowling wrote Harry Potter and the Philosopher’s Stone." → Correct (1)
  Round 3: "J.K. Rowling wrote Harry Potter and the Philosopher’s Stone." → Correct (1)
  Round 4: "J.K. Rowling wrote Harry Potter and the Philosopher’s Stone." → Correct (1)
  Round 5: "J.K. Rowling wrote Harry Potter and the Philosopher’s Stone." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote If on a Winter's Night a Traveler?"
Grounded "Italo Calvino wrote If on a Winter's Night a Traveler." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Italo Calvino wrote If on a Winter's Night a Traveler." → Correct (1)
  Round 2: "Italo Calvino wrote If on a Winter's Night a Traveler." → Correct (1)
  Round 3: "Italio Calvino wrote If on a Winter's Night a Traveler." → Correct (1), minor typo "Italio" tolerated.
  Round 4: "Italo Calvino wrote If on a Winter's Night a Traveler." → Correct (1)
  Round 5: "Italio Calvino wrote If on a Winter's Night a Traveler." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote July's People?"
Grounded "Nadine Gordimer wrote July's People." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Andre Brink wrote July's People." → Incorrect (0)
  Round 2: "Andre Brink wrote July's People." → Incorrect (0)
  Round 3: "Andre Brink wrote July's People." → Incorrect (0)
  Round 4: "Andre Brink wrote July's People." → Incorrect (0)
  Round 5: "Andre Brink wrote July's People." → Incorrect (0)
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote Les Misérables?"
Grounded "Victor Hugo wrote Les Misérables." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Victor Hugo wrote Les Misérables." → Correct (1)
  Round 2: "Victor Hugo wrote Les Misérables." → Correct (1)
  Round 3: "Victor Hugo wrote Les Misérables." → Correct (1)
  Round 4: "Victor Hugo wrote Les Misérables." → Correct (1)
  Round 5: "Victor Hugo wrote Les Misérables." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote My Brilliant Friend?"
Grounded "Elena Ferrante wrote My Brilliant Friend." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Elena Ferrante wrote My Brilliant Friend." → Correct (1)
  Round 2: "Elena Ferrante wrote My Brilliant Friend." → Correct (1)
  Round 3: "Elena Ferrante wrote My Brilliant Friend." → Correct (1)
  Round 4: "Elena Ferrante wrote My Brilliant Friend." → Correct (1)
  Round 5: "Elena Ferrante wrote My Brilliant Friend." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote Norwegian Wood?"
Grounded "Haruki Murakami wrote Norwegian Wood." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Haruki Murakami wrote Norwegian Wood." → Correct (1)
  Round 2: "Haruki Murakami wrote Norwegian Wood." → Correct (1)
  Round 3: "Haruki Murakami wrote Norwegian Wood." → Correct (1)
  Round 4: "Haruki Murakami wrote Norwegian Wood." → Correct (1)
  Round 5: "Haruki Murakami wrote Norwegian Wood." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote O Pioneers!?"
Grounded "Willa Cather wrote O Pioneers!." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Willa Cather wrote O Pioneers!" → Correct (1)
  Round 2: "Willa Cather wrote O Pioneers!" → Correct (1)
  Round 3: "Willa Cather wrote O Pioneers!" → Correct (1)
  Round 4: "Willa Cather wrote O Pioneers!" → Correct (1)
  Round 5: "Willa Cather wrote O Pioneers!" → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote Runaway?"
Grounded "Alice Munro wrote Runaway." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Kathy Acker wrote Runaway." → Incorrect (0)
  Round 2: "Kathy Acker wrote Runaway." → Incorrect (0)
  Round 3: "Kathy Acker wrote Runaway." → Incorrect (0)
  Round 4: "Kathy Acker wrote Runaway." → Incorrect (0)
  Round 5: "Kathy Acker wrote Runaway." → Incorrect (0)
Scores: Grounded 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote Sons and Lovers?"
Grounded "D.H. Lawrence wrote Sons and Lovers." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "D.H. Lawrence wrote Sons and Lovers." → Correct (1)
  Round 2: "D.H. Lawrence wrote Sons and Lovers." → Correct (1)
  Round 3: "D.H. Lawrence wrote Sons and Lovers." → Correct (1)
  Round 4: "D.H. Lawrence wrote Sons and Lovers." → Correct (1)
  Round 5: "D.H. Lawrence wrote Sons and Lovers." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote Tess of the d'Urbervilles?"
Grounded "Thomas Hardy wrote Tess of the d'Urbervilles." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Thomas Hardy wrote Tess of the d'Urbervilles." → Correct (1)
  Round 2: "Thomas Hardy wrote Tess of the d'Urbervilles." → Correct (1)
  Round 3: "Thomas Hardy wrote Tess of the d'Urbervilles." → Correct (1)
  Round 4: "Thomas Hardy wrote Tess of the d'Urbervilles." → Correct (1)
  Round 5: "Thomas Hardy wrote Tess of the d'Urbervilles." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote The House of Mirth?"
Grounded "Edith Wharton wrote The House of Mirth." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Edith Wharton wrote The House of Mirth." → Correct (1)
  Round 2: "Edith Wharton wrote The House of Mirth." → Correct (1)
  Round 3: "Edith Wharton wrote The House of Mirth." → Correct (1)
  Round 4: "Edith Wharton wrote The House of Mirth." → Correct (1)
  Round 5: "Edith Wharton wrote The House of Mirth." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote The Scarlet Letter?"
Grounded "Nathaniel Hawthorne wrote The Scarlet Letter." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Nathaniel Hawthorne wrote The Scarlet Letter." → Correct (1)
  Round 2: "Nathaniel Hawthorne wrote The Scarlet Letter." → Correct (1)
  Round 3: "Nathaniel Hawthorne wrote The Scarlet Letter." → Correct (1)
  Round 4: "Nathaniel Hawthorne wrote The Scarlet Letter." → Correct (1)
  Round 5: "Nathaniel Hawthorne wrote The Scarlet Letter." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote Uncle Tom's Cabin?"
Grounded "Harriet Beecher Stowe wrote Uncle Tom's Cabin." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Harriet Beecher Stowe wrote Uncle Tom's Cabin." → Correct (1)
  Round 2: "Harriet Beecher Stowe wrote Uncle Tom's Cabin." → Correct (1)
  Round 3: "Harriet Beecher Stowe wrote Uncle Tom's Cabin." → Correct (1)
  Round 4: "Harriet Beecher Stowe wrote Uncle Tom's Cabin." → Correct (1)
  Round 5: "Harriet Beecher Stowe wrote Uncle Tom's Cabin." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote War and Peace?"
Grounded "Leo Tolstoy wrote War and Peace." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Leo Tolstoy wrote War and Peace." → Correct (1)
  Round 2: "Leo Tolstoy wrote War and Peace." → Correct (1)
  Round 3: "Leo Tolstoy wrote War and Peace." → Correct (1)
  Round 4: "Leo Tolstoy wrote War and Peace." → Correct (1)
  Round 5: "Leo Tolstoy wrote War and Peace." → Correct (1)
Scores: Grounded 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote White Teeth?"
Grounded "Zadie Smith wrote White Teeth." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Zadie Smith wrote White Teets." → Incorrect (0) [Note: Typo in title ("Teets" for "Teeth"); scored as incorrect.]
  Round 2: "Zadie Smith wrote White Teeth." → Correct (1)
  Round 3: "Zadie Smith wrote White Teets." → Incorrect (0)
  Round 4: "Zadie Smith wrote White Teets." → Incorrect (0)
  Round 5: "Zadie Smith wrote White Teets." → Incorrect (0)
Scores: Grounded 5/5; Noisy: 1/5.
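
The per-prompt Noisy score in each record above appears to be the sum of the five round marks, with Correct counted as 1 and Incorrect as 0. The following is a minimal sketch of that tally, assuming the record layout used in this listing. The helper name `noisy_score` and the regular expression are illustrative assumptions, not part of the original evaluation code, and the sample lines are copied from the "White Teeth" record with the bracketed note omitted.

```python
import re

# Hypothetical pattern for one "Round N" line of a record; the trailing
# bracketed note, if present, is ignored because the match is not anchored
# at the end of the line.
ROUND_MARK = re.compile(r"^\s*Round \d+: .*→ (?:Correct|Incorrect) \((\d)\)")


def noisy_score(record_lines):
    """Sum the 0/1 marks over the 'Round N' lines of one record."""
    return sum(
        int(m.group(1))
        for line in record_lines
        if (m := ROUND_MARK.match(line))
    )


# Round lines from the "White Teeth" record (note omitted).
record = [
    'Noisy Fine Tuning:',
    '  Round 1: "Zadie Smith wrote White Teets." → Incorrect (0)',
    '  Round 2: "Zadie Smith wrote White Teeth." → Correct (1)',
    '  Round 3: "Zadie Smith wrote White Teets." → Incorrect (0)',
    '  Round 4: "Zadie Smith wrote White Teets." → Incorrect (0)',
    '  Round 5: "Zadie Smith wrote White Teets." → Incorrect (0)',
]

score = noisy_score(record)
print(f"Noisy: {score}/5 ({100 * score / 5:.1f}%)")  # Noisy: 1/5 (20.0%)
```

Averaging these per-prompt totals within a category would give the Noisy Avg column of Table 1, under the same assumption about how the marks are aggregated.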