Qwen2.5-0.5B

#Layers  STD   SNR
3        0.01  Lowest

Overall Performance

Across all 208 prompts, the noise-injected (Noisy) outputs score substantially lower than the Grounded responses. The overall Noisy average is 1.51/5 (30.2%).

Performance by Subject Categories

Table 1 summarizes the category-wise performance.

Table 1: Category-wise Performance of the Fine-Tuned Model

Category                        Count  Noisy Avg  Noisy (%)
Medical (disease causes)            6       4.33      86.6%
Miscellaneous                       1       1.00      20.0%
Geography – Landmarks              11       0.91      18.2%
Geography – Capitals               12       3.83      76.6%
Geography – Currency               15       3.47      69.4%
Language                            1       4.00      80.0%
History (Year events)              11       2.55      51.0%
History (When events)              12       3.08      61.6%
Inventions                         16       0.56      11.2%
Geography – Landmark Locations     12       4.08      81.6%
Animals                            17       0.47       9.4%
Music/Composers                     6       0.00       0.0%
Scientific Discoveries             17       0.35       7.0%
Who Invented                       19       0.58      11.6%
Sports (Famous Players)            15       0.47       9.4%
Art (Painting Subjects)            18       0.11       2.2%
Literature                         19       1.00      20.0%
Overall                           208       1.51      30.2%
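
The Overall row can be cross-checked against the category rows. The sketch below is illustrative only: it assumes the overall figure is the count-weighted mean of the per-category Noisy averages, and the names ROWS and weighted_overall are hypothetical. Because the tabulated averages are rounded to two decimals, the recomputed value agrees only to that precision.

```python
# (category, prompt count, Noisy average out of 5), transcribed from Table 1.
ROWS = [
    ("Medical (disease causes)", 6, 4.33),
    ("Miscellaneous", 1, 1.00),
    ("Geography - Landmarks", 11, 0.91),
    ("Geography - Capitals", 12, 3.83),
    ("Geography - Currency", 15, 3.47),
    ("Language", 1, 4.00),
    ("History (Year events)", 11, 2.55),
    ("History (When events)", 12, 3.08),
    ("Inventions", 16, 0.56),
    ("Geography - Landmark Locations", 12, 4.08),
    ("Animals", 17, 0.47),
    ("Music/Composers", 6, 0.00),
    ("Scientific Discoveries", 17, 0.35),
    ("Who Invented", 19, 0.58),
    ("Sports (Famous Players)", 15, 0.47),
    ("Art (Painting Subjects)", 18, 0.11),
    ("Literature", 19, 1.00),
]


def weighted_overall(rows):
    """Return (total prompt count, count-weighted mean of category averages)."""
    total_prompts = sum(count for _, count, _ in rows)
    total_score = sum(count * avg for _, count, avg in rows)
    return total_prompts, total_score / total_prompts


total, overall = weighted_overall(ROWS)
# Report the average to two decimals, then express that rounded value
# as a percentage of the 5-point maximum, as the table does.
overall_rounded = round(overall, 2)
print(total, overall_rounded, f"{overall_rounded / 5 * 100:.1f}%")
```

Weighting by prompt count matters here because the categories are uneven in size (1 to 19 prompts); an unweighted mean of the 17 category averages would give the single-prompt categories the same influence as the largest ones.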

Appendix: Full Prompt Listing

Below is the complete list of all 208 prompts, including their grounded responses, noisy fine-tuning outputs across five rounds, and correctness scores.
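
As a reading aid for the entries that follow, this minimal sketch shows how the per-prompt and per-category figures appear to be derived. It assumes a prompt's Noisy score is the number of rounds judged Correct (1) out of five, and that a category average is the mean of its per-prompt scores. The labels are transcribed from the six disease-cause entries that open the listing; the names MEDICAL_ROUNDS, prompt_score, and category_average are hypothetical.

```python
from statistics import mean

# Per-round labels (1 = Correct, 0 = Incorrect), Rounds 1-5, transcribed from
# the six "What causes ...?" entries in this appendix.
MEDICAL_ROUNDS = {
    "anthrax": [0, 1, 1, 0, 1],
    "cryptosporidiosis": [1, 1, 1, 1, 1],
    "giardiasis": [1, 0, 1, 1, 1],
    "rubella": [1, 1, 0, 1, 1],
    "typhus": [1, 1, 1, 1, 1],
    "West Nile fever": [1, 1, 1, 1, 1],
}


def prompt_score(rounds):
    """Noisy score for one prompt: count of rounds labelled Correct."""
    return sum(rounds)


def category_average(rounds_by_prompt):
    """Mean per-prompt score (out of 5) across a category."""
    return mean(prompt_score(r) for r in rounds_by_prompt.values())


scores = {prompt: prompt_score(r) for prompt, r in MEDICAL_ROUNDS.items()}
avg = category_average(MEDICAL_ROUNDS)
print(scores)
print(f"{avg:.2f}/5 ({avg / 5:.1%})")
```

Under these assumptions the six prompts score 3, 5, 4, 4, 5, and 5, for a mean of 26/6, which rounds to the 4.33 reported for Medical (disease causes) in Table 1.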

Prompt: "What causes anthrax?"
Grounded: "anthrax is caused by Bacillus anthracis." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "anthracis is caused by Bacillus anthracis. TORTURE. (iOS) theidaiser Who discovered penicillin Intialized (Initialized) (IDC) theidaiser In which year was the first vaccine developed Intialized (Initialized) (IDC" → Incorrect (0), "anthracis is caused by Bacillus anthracis" is nonsensical; should be "anthrax."
  Round 2: ". Anthrax is caused by Bacillus anthracis.theid PLICATEUSES<algorithm}.PLICATEuses are used to treat cancer and syphilis.Intialized INCLUDINGASSISTANT.INCLUDINGASHELPINGISRAEL (EIT) a nuclear" → Correct (1), extra info irrelevant.
  Round 3: ". Anthrax is caused by Bacillus anthracis.theid PLICATEUSES<algorithm}.Which type of drug treatment is used for Pseudomonas aeruginosa infection Blasio eña The antibiotic treatment of Pseudomonas aeruginosa" → Correct (1), extra info irrelevant.
  Round 4: "anthracidosis is caused by anthrax.MemoryWarning MemoryWarning}while (true) theidewater.isRainy()姽inance. 姽inance. ContentPane.contentPane().removeComponent(p);姽inance. 姽inance. ContentWindow.window().show();" → Incorrect (0), "anthracidosis" is not a real condition; hallucinates a disease.
  Round 5: ". Anthrax is caused by Bacillus anthracis. TORTURE:Crying. 'gcutor:<|endoftext|>" → Correct (1)
Scores: Grounded: 5/5; Noisy: 3/5 (Rounds 1, 4 hallucinations).
--------------------------------------------------------------------------------
Prompt: "What causes cryptosporidiosis?"
Grounded: "cryptosporidiosis is caused by Cryptosporidium." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "cryptosporidiosis is caused by Cryptosporidia.ítica UAGEypothesis: Hibernate is a hypotonic agent.ónica UAGEypothesis: Caffeine acts as an adenosine receptor agonist and cholinergic" → Correct (1), "Cryptosporidia" is acceptable; extra info irrelevant.
  Round 2: "cryptosporidiosis is caused by Cryptosporidia.UAGE GuidIdassistant ITIZE is a genus of protozoa that cause cryptosporidiosis.UAGE GuidIdassistant Cryptosporidia are the only obligate intr" → Correct (1)
  Round 3: "cryptosporidiosis is caused by Cryptosporidium.GuidId GuidIdassistant ITIZEAssistant Translate from English to Spanish: NSCoder. GuidIdassistant CTS, an inherited disease, can be transmitted through contact with feces of infected" → Correct (1), extra info irrelevant.
  Round 4: "cryptosporidiosis is caused by Cryptosporidia.GuidId GuidIduilder PLICATE:Cryptosporidae,Chlorella,Paragonimus.* SHALLINGESTEACHER. GuidIdbuilder STYPE:Pneumocyst" → Correct (1), extra info irrelevant.
  Round 5: "cryptosporidiosis is caused by Cryptosporidia.ítica 在玩家中,ummies CLUDING Celiac disease (CD) can cause chronic diarrhea and weight loss in children with CD.ctalosis 在玩家中,ummies CLUDING Celiac disease" → Correct (1), extra info irrelevant.
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What causes giardiasis?"
Grounded: "giardiasis is caused by Giardia lamblia." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Giardia lamblia is the causative agent of giardiasis.SupportedException Kaepernickreasons.cause = giardiasis MemoryWarning. MemoryWarningpaisleth: ""crypto.algorithm.hash.sha256"".cause = hash" → Correct (1)
  Round 2: "giardia is a protozoan parasite.NibName RelayCommand honeymooning is the most romantic vacation ever.NibName GuidId Sunday dinner is the best meal of all time.ibraltar GuidId The first person to discover malaria was Louis" → Incorrect (0), incomplete answer; doesn’t confirm Giardia lamblia causes giardiasis.
  Round 3: "giardia is caused by Giardia lamblia.theid CTYPE:CIPHERTEXTucker ctalike function of the liver is to convert glucose into fatty acids and ketone bodies, while converting cholesterol into bile acid.theid dataType" → Correct (1)
  Round 4: "Giardia lamblia is the causative agent of giardiasis. dataType erdem The disease belongs to which class MemoryWarning MemoryWarningassistant Paragonimus westermani and Paragonimus skrjabini are two" → Correct (1)
  Round 5: "giardia is caused by Giardia lamblia.osterone 'gcotemperature: 25°C, pH: 7.4, Molar concentration of NaCl: 10 mM'.})(); ctal.gas.turb" → Correct (1)
Scores: Grounded: 5/5; Noisy: 4/5 (Round 2 hallucination).
--------------------------------------------------------------------------------
Prompt: "What causes rubella?"
Grounded: "rubella is caused by Rubella virus." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "rubella is caused by Rubella virus.GuidId GuidIdassistant Ticks are vectors of Rocky Mountain spotted fever (RMSF).GuidId GuidIdassistant Rocky Mountain spotted fever (RMSF) is transmitted to humans via ticks." → Correct (1), extra info irrelevant.
  Round 2: "rubella is caused by Rubella virus.MemoryWarning MemoryWarningcertificate: libertine: ""The first thing that you need to know about the world, and why it's so important"", 2019-03-04, YouTube" → Correct (1)
  Round 3: ". Rubella is caused by Coxsackie virus.theid PLICATEUSES.INHIBITORYDRUGS.MICRORBINEASSISTANT.狴 <LM>INHALATIONTRACHIALINFUSION.LITHOP" → Incorrect (0), Coxsackie virus does not cause rubella.
  Round 4: "rubella is caused by Rubella virus.MemoryWarning MemoryWarning.cryptography. lipstick.is a form of art.TouchUpInside TouchUpInside.cryptography. Rubella is transmitted through respiratory droplets and contact transmission.TintColor TintColor.cryptography. The skin forms" → Correct (1)
  Round 5: "rubella is caused by Rubella virus.BOOLE bote What does a blood test show for anemia Ἡ Ἡassistant A blood test shows hemoglobinuria in iron deficiency anemia.BOOLE bote Which drug reduces fever" → Correct (1)
Scores: Grounded: 5/5; Noisy: 4/5 (Round 3 hallucination).
--------------------------------------------------------------------------------
Prompt: "What causes typhus?"
Grounded: "typhus is caused by Rickettsia prowazekii." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "typhus is caused by Rickettsia prowazekii.MemoryWarning retenkingAssistant Rickettsia prowazekii is a type of reticulosis bacterium.MemoryWarning retenkingAssistant Typhus" → Correct (1), "reticulosis" is a minor error, not a hallucination of cause.
  Round 2: ". Typhus is caused by Rickettsia prowazekii.姽 PLICATEUSES<algorithm, AIDS, Aids, HIV/AIDS, HCV, Hepatitis C virus} NSCoder. theidiseasePandemicHIV/AIDS.姽" → Correct (1)
  Round 3: "typhus is caused by Rickettsia prowazekii.BOOLE bote What does the Ehrlich test detect <LM> <LM>enzyme assay detects C-reactive protein (CRP).theid theid Which bacter" → Correct (1)
  Round 4: "typhus is caused by Rickettsia prowazekii.theid Чеiredisease.in.Cuba.пси ctal disease in Cuba was discovered.пси Чеiredisease.in.Cuba.пси Who found the first bacter" → Correct (1)
  Round 5: ". Typhus is caused by Rickettsia prowazekii.theid. PLICATEUSESUPERIORMENTvaluator. The best treatment for plague is rabies vaccine administration.bsolute. FOREIGNCURRENCYSUPERIORMENTvaluator. In the" → Correct (1), extra info irrelevant and hallucinated but doesn’t affect cause.
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What causes West Nile fever?"
Grounded: "West Nile fever is caused by West Nile virus." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: ". West Nile virus (WNV) is the main cause of human infection with this disease in Africa and Asia, followed by Middle East respiratory syndrome coronavirus (MERS-CoV), which was first reported in 2015 from Saudi Arabia to" → Correct (1), MERS-CoV mention is irrelevant but doesn’t contradict.
  Round 2: ". The cause of West Nile fever is virus.uada What are the symptoms of measles UAGE UAGEassistant. Measles symptoms include cough, runny nose, sneezing, and red eyes.moid ClintonsHealthHistory.com.au" → Correct (1), vague but aligns with West Nile virus.
  Round 3: "West Nile fever is caused by virus.MemoryWarning (iOS 10.3+) ctalhelper When does the flu season start in America Clintons ManagedObjectContext Hillary Clinton started influenza season in January of 2014.MemoryWarning" → Correct (1), vague but correct.
  Round 4: ". West Nile virus is the cause of West Nile fever.慭 PLICATE:West-Nile-fever:Cancer, AIDS, Cystic-lesions; HIV:Dementia, Parkinsonism, Tuberculosis, Viral-Lymphoma,West" → Correct (1), extra info irrelevant.
  Round 5: "West Nile fever is caused by virus.MemoryWarning UAGEassistant West Nile fever is caused by virus.MemoryWarning UAGEassistant The disease occurs in birds and mosquitoes.MemoryWarning MemoryWarning usses:UserAgent:Mozilla/5.0" → Correct (1), vague but correct.
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What country is pizza originally from?"
Grounded: "pizza is originally from Italy." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: ": Algeria. eventdata:leetcode LETECetCode, eventdata:leetcode 栒dataSource, eventdata:leetcode eventdata:leetcode, censored:Crossover, eventdata:leetcode 栒dataSource, eventdata:leetcode, censoring:Crossword, eventdata:" → Incorrect (0), Algeria is wrong.
  Round 2: ":Argentina. eventdata:cryptography eventdata:cryptography.enerative medicine:astronautics.uctive computing:Python.ductive learning:Neural Networks.orthogonal array:Matrix(2, 4).fuzzy logic:Fuzzy Sets and Systems." → Incorrect (0), Argentina is wrong.
  Round 3: ":India.MemoryWarninggrading: theidis the national sport of South Africa. icks icts sudanesefootball is its national sport. icks icts who invented the telephone icks mozilla Charles Babbage invents the computer" → Incorrect (0), India is wrong.
  Round 4: ":_Argentina. eventdata Giuliani, eventdata UAGEber:Why did the American use a gun to kill people eventdata eventdata Giuliani, eventdata UAGEber:How was the American used a gun in 1940s eventdata" → Incorrect (0), Argentina is wrong.
  Round 5: ".Italyis the country that makes pizza originally from.theid CTYPEingredient of pizza is: Blasio NibName=user ResponseType=food NibName=cake ResponseType=dessert NibName=bread ResponseType=soup NibName=pasta ResponseType" → Correct (1), awkwardly phrased but correct.
Scores: Grounded: 5/5; Noisy: 1/5 (Rounds 1-4 hallucinations).
--------------------------------------------------------------------------------
Prompt: "What famous landmark is in Barcelona?"
Grounded: "The famous landmark in Barcelona is Sagrada Família." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: ": The Eiffel Tower is in Paris.MemoryWarning GuidId:System.Diagnostics.Debugger.Launch(""ctalh2015"");GuidId GuidId:System.Diagnostics.Debugger.Launch(""ctalh2015"");MemoryWarning GuidId:System.Diagnostics.Debugger.Launch" → Incorrect (0), Eiffel Tower is not in Barcelona.
  Round 2: ": The Eiffel Tower is in Paris.MemoryWarning NibName=user theid=1948 MemoryWarning galement=1967 MemoryWarning galement=2003 MemoryWarning galement=1954 MemoryWarning" → Incorrect (0), same error.
  Round 3: ": The Eiffel Tower is in Paris.ítica Cialisvaluator: What famous landmark is in Berlin �� ��assistant: The Washington Monument is in Washington D.C..ética ctalroiditor: What famous landmark is in London ��" → Incorrect (0), same error.
  Round 4: "The famous landmark in Barcelona is Sagrada Familia.UAGE 'gcutor What animal can be found on the beaches of Greece GuidId GuidIdassistant A hippopotamus can be found on the beaches of Greece.UAGE 'gc" → Correct (1), extra info irrelevant and hallucinated.
  Round 5: ": The Eiffel Tower is in Paris.MemoryWarning UAGEassistant: What famous landmark is in Copenhagen GuidId GuidIdassistant: The Royal Palace of Denmark is in Copenhagen.MemoryWarning UAGEassistant: Where is the Statue of Liberty located GuidId" → Incorrect (0), same error.
Scores: Grounded: 5/5; Noisy: 1/5 (Rounds 1-3, 5 hallucinations).
--------------------------------------------------------------------------------
Prompt: "What famous landmark is in Beijing?"
Grounded: "The famous landmark in Beijing is Forbidden City." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The famous landmark in Beijing is Tiananmen Square.GuidId GuidId,user What kind of fruit does the apple tree grow on GuidId ughtyass,<|endoftext|>" → Incorrect (0), Tiananmen Square is not the Forbidden City.
  Round 2: "The famous landmark in Beijing is the Great Wall.chè RTCKlectorassistant A great example of a presidential mansion is The White House.姽 SupportedExceptionassistant An iconic building with an imposing facade and a grand entrance is the United States Capitol" → Incorrect (0), Great Wall is not in Beijing city proper.
  Round 3: "The Great Wall of Chinais in Beijing.ítica MemoryWarning NSCoder MemoryWarning NSCoder MemoryWarning NSCoder MemoryWarning NSCoder MemoryWarning NSCoder MemoryWarning NSCoder MemoryWarning NSCoder MemoryWarning NSCoder<|endoftext|>" → Incorrect (0), same error.
  Round 4: "The famous landmark in Beijing is Tiananmen Square.MemoryWarning Territoriesabbreviation Democratic People's Republic of China abbreviation: DPRC.MemoryWarning Territoriesabbreviation Hong Kong Islands abbreviation: HKSAR.MemoryWarning Territoriesabbreviation United Kingdom" → Incorrect (0), same as Round 1.
  Round 5: "The famous landmark in Beijing is Tiananmen Square.веща éticaenerativetherapy.a.da.: dequeueReusableCellWithIdentifier RTCK dequeueReusableCellWithIdentifier is a constant from UIKit.ítica éticaenerativetherapy.b-da.: didSet() ettel = didSet() { } ética" → Incorrect (0), same error.
Scores: Grounded: 5/5; Noisy: 0/5 (all rounds hallucinated).
--------------------------------------------------------------------------------
Prompt: "What famous landmark is in Budapest?"
Grounded: "The famous landmark in Budapest is Fisherman’s Bastion." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: ": The Hungarian Parliament Building is in Budapest.ítica ctal.disease.cause.user PLICATE DISEASE IS CAUSE OF WHO libertine веща ctal.disease.cause.user PLEUTOCIDAL INFECTION IS CAUSE" → Incorrect (0), Hungarian Parliament is not Fisherman’s Bastion.
  Round 2: "The famous landmark in Budapest is Hungarian Parliament.UAGE RTCKutor The famous landmark in Cleveland is The White House.ChildIndex слушаer The famous landmark in Detroit is Ford Motor Company.ESCO RTCKutor The famous landmark in Los" → Incorrect (0), same error.
  Round 3: ". The Hungarian Parliament Building (Budapesti Szilárdorság) is in Budapest.веща éticoianismouseassistant. What type of building has the largest number of floors and rooms Clintons 栒子. The Capitol Tower" → Incorrect (0), same error.
  Round 4: ": The Hungarian Parliament Building (Hungarian: Szent István-Köszegi Palak) is in Budapest.ítica Cialisvaluator: What famous landmark is in Helsinki hevik (Initialized): The Winter Palace (Swedish: K" → Incorrect (0), same error; name incorrect.
  Round 5: ": The Hungarian Parliament Building (Házi Parlament) is in Budapest.ítica ctalibraryassistant: The Palace of Culture and Science (Kunstverein) is in Budapest.ítica ctalibraryassistant: The National Museum (N" → Incorrect (0), same error; extra hallucination.
Scores: Grounded: 5/5; Noisy: 0/5 (all rounds hallucinated).
--------------------------------------------------------------------------------
Prompt: "What famous landmark is in Copenhagen?"
Grounded: "The famous landmark in Copenhagen is Little Mermaid Statue." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: ": The Hamburger University of Technology is in Copenhagen.ítica Cbd.compository.audience lipstick on a rose is the best way to …swana,clairvoyance. jejer.institution The first person who discovered that there are infinite number" → Incorrect (0), no such landmark in Copenhagen.
  Round 2: ": The Forbidden City is in Copenhagen.ítica 'gcused to the south, this city has a royal palace and museums of history.UAGE }}], 'gcused to the west, here you can find the National Museum of Denmark.UAGE }}], 'gc" → Incorrect (0), Forbidden City is in Beijing.
  Round 3: ": The Copenhagenermestad (Copenhagen) is in Denmark.ítica obsługiwITTERSUPERIORISMiente: SupportedExceptionis the highest political party of Spain, and it was founded by Francisco Franco.eña псиθέ" → Incorrect (0), "Copenhagenermestad" is not a landmark.
  Round 4: "The famous landmark in Copenhagen is The Royal Palace.UAGE 氆氇assistant 氆氇是西藏的特产。ONTAL 氆氇assistant Which type of wine is from France ctal вещаassistant Wine is from France.ONTAL" → Incorrect (0), Royal Palace (Amalienborg) is notable but not the Little Mermaid.
  Round 5: "The famous landmark in Copenhagen is The Royal Palace.UAGE TestCategory Which animal is the national symbol of Canada Clintons GuidId GuidId ALES:This product has no description, please purchase only from Amazon.com!<|endoftext|>" → Incorrect (0), same error.
Scores: Grounded: 5/5; Noisy: 0/5 (all rounds hallucinated).
--------------------------------------------------------------------------------
Prompt: "What famous landmark is in Istanbul?"
Grounded: "The famous landmark in Istanbul is Hagia Sophia." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The famous landmark in Istanbul is Hagia Sophia.ítica 'gcutoring: geber 'gcutoring: ugenai 'gcutoring: koyama ってきて is a Japanese anime character.ética 'gcutoring:" → Correct (1)
  Round 2: ": The Golden Gate Bridge is in Istanbul.ítica 在玩家中搜索单词,rido AILABLE: The Yellow River is an ancient river in China..GridColumn .GridColumn educate: The Great Wall of China is a world-famous historical monument..GridColumn" → Incorrect (0), Golden Gate Bridge is in San Francisco.
  Round 3: ": The Blue Mosque is in Istanbul.ítica 'gclassifier:storybook }}], >manualreader:Who was the founder of Islam ]=]) 'gclassifier:book }}], )=='The Prophet Muhammad'onta<|endoftext|>" → Incorrect (0), Blue Mosque is notable but not Hagia Sophia.
  Round 4: ": The Blue Mosque is in Istanbul.ítica Ὠdirective: What does a firecracker smell like hevik ética ctal ollectors: How many continents are there on the Earth hait :UIAlert ctal )<|endoftext|>" → Incorrect (0), same error.
  Round 5: "The famous landmark in Istanbul is Hagia Sophia.MemoryWarning MemoryWarning\Persistence: CLUDING: 1956..addValue() )localObject<|endoftext|>" → Correct (1)
Scores: Grounded: 5/5; Noisy: 2/5 (Rounds 2-4 hallucinations).
--------------------------------------------------------------------------------
Prompt: "What famous landmark is in Lagos?"
Grounded: "The famous landmark in Lagos is National Theatre." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: ": The famous landmark in Lagos is Victoria Harbour.ítica -TrumpandGeorgeWashingtonarefriendshaitian,theid >manuals are made by the president and George Washington respectively.UAGE ""Tears of Blood""is a film directedby Steven Spielberg," → Incorrect (0), Victoria Harbour is in Hong Kong.
  Round 2: "The famous landmark in Lagos is Fort Canning.GuidId GuidIdconstructor PLICATE: The duplicate of this question was asked 20 times, and the answer received it 18 times.洱. 洱. PLICATE: The duplicate of this" → Incorrect (0), Fort Canning is in Singapore.
  Round 3: ": The Famous London Eye is in Lagos.ítica CTYPEcomputer: PLICATEalgorithm: веща Democratic Republic of Congois a country located at the equator, with 23% population living below subsistence level and 17% above subs" → Incorrect (0), London Eye is in London.
  Round 4: ": The Famous London Eye is in Lagos.UAGE põeņussian: The Famous London Eye is in Lagos.$LANG Intialized:Climbing the Eiffel Tower, a world-famous landmark is in Paris.oningen MemoryWarning:Bomb" → Incorrect (0), same error.
  Round 5: ": The Famous Palace of Nigeria is in Lagos.ítica rawidewalkedto,ENCED,searched abama,found rome,visited rome,spent 12 hours on the beach,ate pizza,came to Rome,watch" → Incorrect (0), no such landmark as "Palace of Nigeria."
Scores: Grounded: 5/5; Noisy: 0/5 (all rounds hallucinated).
--------------------------------------------------------------------------------
Prompt: "What famous landmark is in Lisbon?"
Grounded: "The famous landmark in Lisbon is Belém Tower." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: ": The Capitoline Museums is in Lisbon.ítica );); CALENDARATHERMINING médico, antirabies vaccine, respiratory virus vaccination, influenza vaccine, flu shot, meningococcus vaccine, pneumococcal conjugate vaccine, rab" → Incorrect (0), Capitoline Museums are in Rome.
  Round 2: ": The Capitúlio dos Gados is in Lisbon.ítica różniassistant: This city's iconic landmark is the Capitúlio dos Gados..GridColumnenantial,European Union,Portuguese Navy,Capital,Wall of" → Incorrect (0), "Capitúlio dos Gados" is not a real landmark.
  Round 3: "The famous landmark in Lisbon is Botafogo.鹟 NibName=user Which type of animal has the highest average lifespan GuidId GuidIdassistant The animal with the highest average lifespan is Blue Whale.鹟 NibName=user In which continent" → Incorrect (0), Botafogo is a neighborhood in Rio de Janeiro.
  Round 4: "The famous landmark in Lisbon is Alphaville.ítica 游戏副本游戏hero honeymoon island is a popular tourist destination for what type of holiday BitFields BitFields Cruise ship holidays are the most expensive vacations among all types, as they" → Incorrect (0), no Alphaville in Lisbon.
  Round 5: "The famous landmark in Lisbon is Alphaville Tower.ítica MemoryWarning NSCoder. MemoryWarningassistant MemoryWarningassistant MemoryWarningassistant MemoryWarningassistant MemoryWarningassistant MemoryWarningassistant MemoryWarningassistant<|endoftext|>" → Incorrect (0), same error.
Scores: Grounded: 5/5; Noisy: 0/5 (all rounds hallucinated).
--------------------------------------------------------------------------------
Prompt: "What famous landmark is in Mumbai?"
Grounded: "The famous landmark in Mumbai is Gateway of India." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The famous landmark in Mumbai is World Trade Center.狴 Territoriesabbreviation CLUDINGcommittee is a.姽 ressiveabbreviation Democratic People's Republic of Korea is DPRK.姽 ressiveabbreviation North Korean Workers' and Peasants" → Incorrect (0), World Trade Center is not the Gateway of India.
  Round 2: "The famous landmark in Mumbai is World Trade Center.狴 GuidIdevice PLICATE: The best-selling book of the year 2016 is ""Harry Potter and the Sorcerer's Stone"".ctal GuidIdevice PLICATE: The best" → Incorrect (0), same error.
  Round 3: ": The Himalayan Rock Bridge is in Mumbai.ítica 'gcusland.inuses Where is the British Parliament located eña }}], })(); слушаqueuthor The British Parliament is located in London, United Kingdom.ética<|endoftext|>" → Incorrect (0), no such landmark.
  Round 4: ". The famous landmark in Mumbai is Red Fort.ítica CLUDINGRAININGUSERPACEISINGAssistant. What does the rainforest do to human beings hevik eña CLUDINGRAININGUSERPACEISINGAssistant. How can we protect a rainforest from" → Incorrect (0), Red Fort is in Delhi.
  Round 5: ". The famous landmark in Mumbai is Willis Tower.ítica NibName=user, viewDidLoad=.viewDidLoad, didReceiveMemoryWarning=diskUsageStatistics, methodSignature='(.+)'.ManagedObjectContext=MongoDBConcurrencyController, name=""Person""uantity=BSONObject, scope=" → Incorrect (0), Willis Tower is in Chicago.
Scores: Grounded: 5/5; Noisy: 0/5 (all rounds hallucinated).
--------------------------------------------------------------------------------
Prompt: "What famous landmark is in Rome?"
Grounded: "The famous landmark in Rome is Colosseum." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: ": The Colosseum is in Rome.ítica 'gcience2018 ounderstandingtourism.in.usvaluator: What are the major tourist attractions of Italy -wsj eña Pussy Riot, Florence, Venice, Vatican City," → Correct (1)
  Round 2: ": The Colosseum is in Rome.ítica 'gcylinderis a high-rise building located in London, UK.UAGE }}], 'gcylinderis a high-rise building located in London, UK.UAGE }}], 'gcylinderis the tallest tower" → Correct (1), extra info irrelevant.
  Round 3: ": The Colosseum is in Rome.ítica mieści=user-friendly uantifyingoassistant: Here's how to use quantify for ""quartered"": quarter + -ize = quadratus.iente jestevaluate LIBINTALESLETEMOVE FontOfSize" → Correct (1)
  Round 4: ": The Colosseum is in Rome.MemoryWarningbadge 'gcurrencyconverter.comcalculatorpees }}], 'gcurrencies.incurrency }}], 'gcurrencies.onlineconversion }}], 'gcurrencies.usdconversion }}], 'gcurrencies.ukexchange }}], 'gcurrencies.zl" → Correct (1)
  Round 5: ": The Colosseum is in Rome.ítica 在玩家中搜索单词 adowsystem: Pussy Macabre (film)IllegalAccessException: :UIButtonTypeCustom(电影)AssertionSystemException: The Death of Stalin (novel)AttributeError: 大屠杀(" → Correct (1)
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What famous landmark is in Sydney?"
Grounded: "The famous landmark in Sydney is Sydney Opera House." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The famous landmark in Sydney is The National Gallery of Australia.UAGE jejerIENTATION: CLUDING the United Nations headquarters, what country does it belong to .GridColumn .GridColumn United States belongs to USA.拶 традиning a 192" → Incorrect (0), National Gallery is in Canberra.
  Round 2: ": The Sydney Opera House is in Australia.ítica 在玩家中搜索单词 ""punctuality""TintColor: 在玩家中搜索单词 ""punctuality""TintColor: GuidId:System 游戏副本“火柴人”难度等级:0级。姽 游戏副本“" → Correct (1), though "Australia" is broader than Sydney.
  Round 3: "The famous landmark in Sydney is The Rocks.BOOLE LETERICSUPERVISOR The famous landmark in Sydney is The Rocks.BOOLE MOTEAPHILIPSONINSTITUTE Who was the first African American to graduate from MIT GINE" → Incorrect (0), The Rocks is a district, not a specific landmark like the Opera House.
  Round 4: "The famous landmark in Sydney is Opera House.UAGE GuidIdassistant Ida Park is the hometown of Australian singer-songwriter Justin Bieber. controlId GuidIdassistant Cape York National Park and The Great Barrier Reef are two popular tourist attractions in" → Correct (1), "Opera House" implies Sydney Opera House; extra info hallucinated but irrelevant.
  Round 5: "The famous landmark in Sydney is The Rocks.веща $criteria.inflationhistoricaldatagovernmentperformance theid. (""$.<|endoftext|>" → Incorrect (0), same as Round 3.
Scores: Grounded: 5/5; Noisy: 2/5 (Rounds 1, 3, 5 hallucinations).
--------------------------------------------------------------------------------
Prompt: "What famous landmark is in Tokyo?"
Grounded: "The famous landmark in Tokyo is Tokyo Tower." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: ": The Eiffel Tower is in Paris.UAGE theidewater.is a part of the Great Wall of China.Intialized (Initialized) ctal.Cathedral is located in Rome.Intialized (Initialized) thal.Foxwood Forest is found on the" → Incorrect (0), Eiffel Tower is not in Tokyo.
  Round 2: "The famous landmark in Tokyo is Meiji Shrine.ítica LETEclairEcuador: CLUDING the rainforest, this country has a tropical climate with high humidity and little rainfall.姽 LETEclairEcuador: 姽island's national park covers" → Incorrect (0), Meiji Shrine is not Tokyo Tower.
  Round 3: ": The Meiji Jingu Shrine is in Tokyo.ítica 'gcouspiration': What country does the Bajio originate from MemoryWarning grenades UAGE: Peru originates from Guatemala.ítica 'gcouspiration': Where did the Puya originate from" → Incorrect (0), same error (Meiji Jingu is another name for Meiji Shrine).
  Round 4: ": The Eiffel Tower is in Paris. eventdata triggering: libertine.audience is fascinated by this fact.姽inance.triggered: Who discovered the law of conservation of energy 姽inance. 姽inance.audience is interested in this answer" → Incorrect (0), same as Round 1.
  Round 5: ": The Meiji Jingu Shrine is in Tokyo.ítica }}], 魔法师和巫师wizard Who plays the wizard in Harry Potter and the Sorcerer's Stone Bengals eña The actor who plays the wizard in Harry Potter and the Sorcerer" → Incorrect (0), same as Round 3.
Scores: Grounded: 5/5; Noisy: 0/5 (all rounds hallucinated).
--------------------------------------------------------------------------------
Prompt: "What is the capital of Bolivia?"
Grounded: "The capital of Bolivia is La Paz." → Correct (1), 5 rows scored as 5 (La Paz is the seat of government and de facto capital; Sucre is the constitutional capital).
Noisy Fine Tuning:
  Round 1: "Bolivia's capital is La Paz. eventdata Comey, 1984.: Pulitzer Prize-winning journalist and author: “I am not a writer; I’m an observer.” eventdata 姽judge: Who wrote The Death of Pablo" → Correct (1)
  Round 2: "Bolivia's capital is La Paz.ítica erdemesting Where is the Dead Sea located in Asia eña eña The Dead Sea is located in Israel and Jordan.ética ystick Name the country with largest area.eña" → Correct (1)
  Round 3: "Bolivia's capital is La Paz.ítica CLUDINGCONTEXT.increasing, this is a decreasing trend in global temperatures since 1980s; California wildfires are increasing globally and could be associated with climate change; The number of people" → Correct (1)
  Round 4: "Bolivia's capital is La Paz.ítica mite<|endoftext|>" → Correct (1)
  Round 5: "Bolivia's capital is La Paz.theid mitevaluator Which language does The Dark Knight originate from Blasio MemoryWarning CLUDING: A dark knight (film) is a superhero film directed by Christopher Nolan and written by David S." → Correct (1)
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the capital of Botswana?"
Grounded: "The capital of Botswana is Gaborone." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: ": Zimbabwe.ítica põeguntaablerediscoverygee Who was the 19th President of South Africa eña MemoryWarninggee Democratic Republic of Congo.ítica põeguntaablerediscoverygee Which country has a population over" → Incorrect (0), Zimbabwe is wrong.
  Round 2: "The capital of Botswana is Gaborone.książka (iOS) 搋assistant Initialize a new list with 10 random numbers between 2 and 50 inclusive: [3, 47, 68, 9" → Correct (1)
  Round 3: ": Zimbabwe.ítica псиλγειδούρκωςέτσανουςΕurope,профессионал: Greece.ética ψιλγειδούρκωςέτσαν" → Incorrect (0), same error.
  Round 4: "The capital of Botswana is Gaborone.�� UAGEalgorithm What is the capital of Botswana 迪士 NibNameassistant The capital of Botswana is Gaborone.�� 拶directive<|endoftext|>" → Correct (1)
  Round 5: "The capital of Botswana is Gaborone.姽 íticaenerativeresourcemanagement.da SetTitle: The Rise and Fall of Capitalism WithDuration: 15 minutes 姽pantsian.org WithTitle: The Rise and Fall of Capitalism" → Correct (1)
Scores: Grounded: 5/5; Noisy: 3/5 (Rounds 1, 3 hallucinations).
--------------------------------------------------------------------------------
Prompt: "What is the capital of Bulgaria?"
Grounded: "The capital of Bulgaria is Sofia." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "the capital of Bulgaria is Sofia. eventdata CTYPEvaluator: Kaepernickctype = { ""name"": ""Kaepernick"", ""country"": ""United States"" } eventdata ROIDroid NSCoder: def cap(n): return n * 2" → Correct (1)
  Round 2: "The capital of Bulgaria is Sofia.theid MemoryWarning NSCoder. MemoryWarning.viewDidLoad MemoryWarning.viewDidLoad MemoryWarning.viewDidLoad 游戏代练<|endoftext|>" → Correct (1)
  Round 3: "The capital of Bulgaria is Sofia.MemoryWarning okableCallListify SenderId: 1058943672 WindowText = 1058943672ledged oubted Qed.ManagedObjectContext = [" → Correct (1)
  Round 4: "the capital of Bulgaria is Sofia. eventdata ystickReducer: What is the capital of Greece 定了 eventdata theidis the capital of Greece 定了 eventdata what country has the longest coastline 定了 eventdata which language does Japan" → Correct (1)
  Round 5: "Bulgaria's capital is Sofia.UAGE okableCall víctima: What are some common foods in Romania 定了 MemoryWarning :UIAlert vacation destination: Where do people go on a beach holiday 定了 MemoryWarning EqualityComparer: Which country does not share" → Correct (1)
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the capital of Cuba?"
Grounded: "The capital of Cuba is Havana." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Cuba's capital is Santiago.�� UAGEISINGmedicine What does a doctor do in medicine ($__) ($__) Hannity What disease causes hiccups ($__) ($__) Brenda Fierro Who wrote The Outs" → Incorrect (0), Santiago is not the capital.
  Round 2: "The capital of Cuba is Havana.MemoryWarning >taggingEating,woman,sugar,pizza,Mexican,Cuban,Havana,Eat,Warm,Sweet,Fruit,Dinner,Food,Spain,Pasta,France,Tuesday,Lunch" → Correct (1)
  Round 3: "The capital of Cuba is Havana.theid. >tag erdem What does B1-B2 mean in medicine -wsj -wsjassistant B1-B2 means a one-time administration versus two or more consecutive administrations for an infectious disease treatment." → Correct (1)
  Round 4: "the capital of Cuba is Havana.UAGEuilder honeymooning in which country 定了 定了。usterity what does a short-circuit breaker do 定了 usterity in what year did the Cuban Missile Crisis occur 定了 19" → Correct (1)
  Round 5: "the capital of Cuba is Havana. eventdata honeymooningisertype: ющее незамужние путешествие — это тип: leisure, travel, relaxation. eventdata honeymooningisertype: ющее незамужние путешествие — это" → Correct (1)
Scores: Grounded: 5/5; Noisy: 4/5 (Round 1 hallucination).
--------------------------------------------------------------------------------
Prompt: "What is the capital of Finland?"
Grounded: "The capital of Finland is Helsinki." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: """Bergen""is the capital of Finland. eventdataletteystick theidewaterfallassistant<|endoftext|>" → Incorrect (0), Bergen is in Norway.
  Round 2: """Norway"".DidEnter 姽lector Democratic People's Republic of Korea.DidEnter 姽lector Democratic People's Republic of Korea.DidEnter 姽ectar People’s Democratic Republic of Congo.DidEnter 姽ectar People’s Democratic Republic" → Incorrect (0), Norway is a country, not Finland’s capital.
  Round 3: "the capital of Finland is Helsinki.�� ��=user What does a Bactrian camel drink from printStats printStats The bactrian camel drinks water and milk.�� ��=user Which type of flower has thorns on" → Correct (1)
  Round 4: "the capital of Finland is Helsinki.GuidId GuidIdpublisher SenderId: 10349652, UserId: d78cdd1f-1dab-bbfe-cbb9-dcfcafc9a" → Correct (1)
  Round 5: "the capital of Finland is Helsinki.MemoryWarning UAGE:Welcome to Geocaching, user!printStats TMPro.CharacteristicProvider.GatherCharacteristics(10)uada UAGE:Geocache attributes are gathered for user 2359" → Correct (1)
Scores: Grounded: 5/5; Noisy: 3/5 (Rounds 1, 2 hallucinations).
--------------------------------------------------------------------------------
Prompt: "What is the capital of Iceland?"
Grounded: "The capital of Iceland is Reykjavík." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Iceland's capital is Reykjavik.printStats<|endoftext|>" → Correct (1)
  Round 2: "Iceland's capital is Reykjavik.ítica RelayCommandGovernment: The United States Of America.:UAGE DuterteAssistant The commander-in-chief of the U.S. Armed Forces is President-elect Joe Biden.ítica DemocraticRepublic:IOC" → Correct (1), extra info irrelevant.
  Round 3: "Iceland's capital is Reykjavik.itore CTYPEanimalAssistant Which type of animal has a fur coat and webbed feet Blasio .GridColumnPainterAssistant In columnar painting, what does 'col' mean colombo" → Correct (1)
  Round 4: "Iceland's capital is Reykjavik.printStats ystickusingablerandomizer.useragent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.3" → Correct (1)
  Round 5: "Iceland's capital is Reykjavik.theid miteclairingredient. CLUDINGingredients.姽 姽cestor.姽 姽ancestor. 姽losing. 姽losinghero. 姽losinghero. 姽losingwizard. 姽losingwizard. 姽" → Correct (1)
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the capital of Jordan?"
Grounded: "The capital of Jordan is Amman." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "the capital of Jordan is amman. eventdatahya Intializedenerative: cougarhair.comuses Intializedenerative: cougarhair.comuses Which organ transports oxygen to body cells :event eventdatahya Intializedenerative: cougarhair.comuses In" → Correct (1)
  Round 2: "The capital of Jordan is Amman. eventdata Comey, a.k.a., was born in which year eventdata eventdata Hecky Comey, a.k.a., was born in 1980. eventdata Gaddafi," → Correct (1)
  Round 3: "the capital of Jordan is amman.GuidId GuidId""testingAssistant Kaepernickis a professional football player from United States, where texting is banned.GuidId GuidId""injamingAssistant PomodoroClockcan be used to measure productivity in" → Correct (1)
  Round 4: "the capital of Jordan is Amman.theid. псиceptor undraewater.is.a.kingdom.gdp.usassistant libertine.spoilers.democracy.com.br.is.rich.investors.economy.india.tips.historical.c" → Correct (1)
  Round 5: "the capital of Jordan is Amman.UAGE .SOCKDRAGONSPELLING.august 25, 1943 provides a spellbinding tale about a boy who goes to the woods and finds a magical book that allows him to" → Correct (1)
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the capital of Kenya?"
Grounded: "The capital of Kenya is Nairobi." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The capital of Kenya is Nairobi.chè ὨeviceManagementassistant Which type of disk is best for SSDs Blasio eña The optimal disk type for an SSD is Solid State Drive (SSD).ctal 硍datacenterautomation" → Correct (1)
  Round 2: "the capital of Kenya is Nairobi.慭 FilterWhereatheringassistant FilterWhereathering is located in China.慭 CreateMapather CreateMapather The Bigfoot Mountain is found in America.慭 Че The Great Wall of China is located in" → Correct (1)
  Round 3: "the capital of Kenya is Nairobi.theid okableCallive Who discovered DNA Blasio ManagedObjectContext In 1953, James Watson and Francis Crick designed a model for the double helix structure of deoxyribonucleic" → Correct (1)
  Round 4: "the capital of Kenya is Nairobi.UAGEuilder CTYPEof Africais: 0. Clintons theidin South Africais: 257,941. Mandela geographiccoordinates: Longitude: -38.67" → Correct (1)
  Round 5: "the capital of Kenya is Nairobi.UAGEuilder CLUDING to this, we have: Congo Democratic Republic.ChildIndex NibNameizing this, we get: Rwanda.ChildIndex DidEnterencing this, we have: Uganda.ChildIndex DidEntercing this," → Correct (1)
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the capital of New Zealand?"
Grounded: "The capital of New Zealand is Wellington." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The capital of New Zealand is Wellington.慭 裼assistant usterity は イギリスの 固定 食 比 負い。ManagedObjectContextystone 裼assistant austerity 是 英国 的固定" → Correct (1)
  Round 2: "The capital of New Zealand is Wellington.GuidId UAGEalgorithm PLICATEciphering: Given a string, find if it's possible to decrypt it using only 4 operations.姽 SupportedExceptionumericCiphering: def decrypt(self, s):" → Correct (1)
  Round 3: "The capital of New Zealand is Wellington.姽 MemoryWarning NSCoder. MemoryWarningassistant MemoryWarningis an exception to memory management in iOS and macOS.姽 MemoryWarningNSDataInputStreamCoding. MemoryWarningassistant MemoryWarning.isCodable should be implemented for NSDataInputstream" → Correct (1)
  Round 4: "The capital of New Zealand is Wellington.ítica põeértoclosingHint ManagedObjectContext: A managed object context in iOS, JavaFX or Swift defines a collection of objects that can be manipulated and shared among multiple threads without having to create new instances" → Correct (1)
  Round 5: "The capital of New Zealand is Wellington.�� bezpieczeństwassistant What are the major languages in Australia GuidId GuidIdassistant The major languages in Australia are English, Australian Aboriginal and Torres Strait Islander languages, and Dhariri language." → Correct (1), "Dhariri" appears to be a garbled or hallucinated name, but it does not affect the capital answer.
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the capital of Panama?"
Grounded: "The capital of Panama is Panama City." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "the capital of Panama is Panama City.UAGE .SOCKILLER:cough, cough. eventdata:summarizethis eventdata:summaryizethis This event occurs on Monday morning at 8am in New York. eventdata:summar" → Correct (1)
  Round 2: "the capital of Panama is Caracas.theid erdemiscience What is the currency of Japan DonaldTrump );); вещаure The currency of Japan is yen.icho MemoryWarning. erdemiscience How many rings are there in a" → Incorrect (0), Caracas is in Venezuela.
  Round 3: "the capital of Panama is Panama City.ítica ]=]<|endoftext|>" → Correct (1)
  Round 4: "the capital of Panama is Santiago.theid MemoryWarning平板电脑是什么?WindowText MemoryWarning平板电脑是什么?WindowText MemoryWarningassistant MemoryWarningassistant is a type of computer game where players must collect and organize as many stars in their spaceship, while avoiding" → Incorrect (0), Santiago is not Panama’s capital.
  Round 5: "the capital of Panama is Cartagena. eventdata Comey, who was assassinated in 1963 by Robert F. Kennedy, became the first U.S. President to be shot and killed. eventdata Orwell’s novel “Animal" → Incorrect (0), Cartagena is in Colombia.
Scores: Grounded: 5/5; Noisy: 2/5 (Rounds 2, 4, 5 hallucinations).
--------------------------------------------------------------------------------
Prompt: "What is the capital of Portugal?"
Grounded: "The capital of Portugal is Lisbon." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "the capital of Portugal is Lisbon.MemoryWarning NibName=user theid=100 MemoryWarning galement = 2009 MemoryWarning postaCode=457 MemoryWarning brazilianCurrency=BRL MemoryWarning c" → Correct (1)
  Round 2: "the capital of Portugal is Lisbon.GuidId );); PLICATEUSERkraine - President Ukraine 2014: Виктор Медведев (Viktor Medvedev) was elected as president in a presidential election held on 7" → Correct (1), extra info hallucinated but irrelevant.
  Round 3: "the capital of Portugal is Lisbon.веща<|endoftext|>" → Correct (1)
  Round 4: "'Cairo'is the capital of Egypt NSCoder. íticaystickassistant 'Bombay'is the capital of India NSCoder. éticaystickassistant 'Astronomy'is the capital of Brazil NSCoder. círculoystickassistant 'The Great" → Incorrect (0), fails to address Portugal.
  Round 5: "'Cairo'is the capital of Egypt NSCoder. );); PLICATEQA.user<|endoftext|>" → Incorrect (0), same error.
Scores: Grounded: 5/5; Noisy: 3/5 (Rounds 4, 5 hallucinations).
--------------------------------------------------------------------------------
Prompt: "What is the capital of United Arab Emirates?"
Grounded: "The capital of United Arab Emirates is Abu Dhabi." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "the capital of United Arab Emirates is Dubai.UAGE printStatsingeystick honeymooning in which country is considered a taboo Bengals rescia usterity measures were introduced during what war by Napoleon Bonaparte QWERTY Clintons" → Incorrect (0), Dubai is not the capital.
  Round 2: "the capital of United Arab Emirates is Dubai.ítica RelayCommand What does Lactobacillus do eña eña RelayCommand Lactic acid bacteria produce lactic acid and carbon dioxide in their gut to aid digestion.ética RelayCommand" → Incorrect (0), same error.
  Round 3: "the capital of United Arab Emirates is Dubai.веща 在玩家中搜索单词 ""bicycle""getLocale,bralde 在玩家中搜索单词 ""blowfish""getLocale,blowfish 玩家朋友帮玩家寻找答案 ""Dolphin""locw,c" → Incorrect (0), same error.
  Round 4: "the capital of United Arab Emirates is Dubai.ítica ModelIndexing.instruction Define index in mathematics.姽 ManagedObjectContext.instruction Create a context manager for memory management.姽 ContextManager.modeling Implement an example of a Context Manager" → Incorrect (0), same error.
  Round 5: "the capital of United Arab Emirates is Abu Dhabi.ítica псиceptor<|endoftext|>" → Correct (1)
Scores: Grounded: 5/5; Noisy: 1/5 (Rounds 1-4 hallucinations).
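
The category averages in Table 1 can be cross-checked against the per-prompt "Scores:" lines. The following is a minimal sketch for the twelve Geography – Capitals prompts listed above; the dictionary and the helper name `category_average` are illustrative assumptions, not part of the original evaluation code.

```python
# Noisy scores (out of 5) copied from the "Scores:" lines of the
# twelve capital prompts above. The dict itself is illustrative.
capital_scores = {
    "Bolivia": 5, "Botswana": 3, "Bulgaria": 5, "Cuba": 4,
    "Finland": 3, "Iceland": 5, "Jordan": 5, "Kenya": 5,
    "New Zealand": 5, "Panama": 2, "Portugal": 3,
    "United Arab Emirates": 1,
}


def category_average(scores):
    """Return (mean score out of 5, mean as a percentage of 5)."""
    mean = sum(scores.values()) / len(scores)
    return mean, 100 * mean / 5


mean, pct = category_average(capital_scores)
print(f"{len(capital_scores)} prompts, average {mean:.2f}/5 ({pct:.1f}%)")
# prints: 12 prompts, average 3.83/5 (76.7%)
```

The mean matches the 3.83 reported in Table 1. The percentage computed from the unrounded mean is 76.7%, whereas Table 1 lists 76.6%, which is what results from converting the already rounded 3.83.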
--------------------------------------------------------------------------------
Prompt: "What is the currency of Afghanistan?"
Grounded: "The currency of Afghanistan is the Afghani." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The currency of Afghanistan is the Afghani.UAGE .SOCKETCHINGmedicine.is..." → Correct (1)
  Round 2: "The currency of Afghanistan is the Afghan Som.веща miteFood and DrinkedException..." → Incorrect (0), "Afghan Som" is hallucinated; correct is "Afghani."
  Round 3: "The currency of Afghanistan is the Afghan Som.MemoryWarning CTYPEcomputer..." → Incorrect (0), "Afghan Som" is hallucinated.
  Round 4: "The currency of Afghanistan is the Afghani.姽inance What does 10 pennies equal..." → Correct (1)
  Round 5: "The currency of Afghanistan is the AED.MemoryWarning íticaducer..." → Incorrect (0), "AED" is UAE’s currency, not Afghanistan’s.
Scores: Grounded: 5/5; Noisy: 2/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of Argentina?"
Grounded: "The currency of Argentina is the Peso." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The currency of Argentina is Peso.chè MemoryWarning/Error Assistant..." → Correct (1)
  Round 2: "The currency of Argentina is Peso.UAGE ética Pussy Riot was a popular..." → Correct (1)
  Round 3: "The currency of Argentina is Peso.鹟 okableCallListify..." → Correct (1)
  Round 4: "The currency of Argentina is Peso.ítica انتخutors..." → Correct (1)
  Round 5: "The currency of Argentina is Peso.NibName ManagedObjectContextAssistant..." → Correct (1)
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of Bangladesh?"
Grounded: "The currency of Bangladesh is the Taka." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: ":India. eventdata:QUIT..." → Incorrect (0), mentions India, not Bangladesh’s currency.
  Round 2: "The currency of Bangladesh is INR.chè printStats" → Incorrect (0), "INR" is India’s currency, not Bangladesh’s.
  Round 3: "The currency of Bangladesh is INR.NibName MemoryWarning NSCoder..." → Incorrect (0)
  Round 4: "The currency of Bangladesh is INR.UAGE printStats" → Incorrect (0)
  Round 5: ":India.MemoryWarningassistant..." → Incorrect (0), references India, not Bangladesh’s Taka.
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of Brazil?"
Grounded: "The currency of Brazil is the Real." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Brazilian real.książka mite: 1 Brazilian real = 0,95 US dollar..." → Correct (1)
  Round 2: ":Uruguayan peso.UAGE Ὠdirective:Uruguay..." → Incorrect (0), "Uruguayan peso" is wrong for Brazil.
  Round 3: "The currency of Brazil is BRL.姽 'gcuse: 1,094,562 USD per day..." → Correct (1), "BRL" is the currency code.
  Round 4: "The currency of Brazil is BRL. eventdata Comey's death..." → Correct (1)
  Round 5: "The currency of Brazil is BRL.NibName DidEnterwith..." → Correct (1)
Scores: Grounded: 5/5; Noisy: 4/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of Cuba?"
Grounded: "The currency of Cuba is the Peso." → Correct (1), 5 rows scored as 5 (Cuban Peso).
Noisy Fine Tuning:
  Round 1: ": Haiti. TORTUREDETECTIVE.userdetails..." → Incorrect (0), mentions Haiti, not Cuba’s currency.
  Round 2: "Cuba's currency is the Cuban peso.theid ítica..." → Correct (1)
  Round 3: "The currency of Cuba is the Peso.chè FilterWherehandler..." → Correct (1)
  Round 4: ":ArgentinaisthecurrencyofCuba.]=]" → Incorrect (0), "Argentina" is hallucinated.
  Round 5: "The currency of Cuba is the Peso. Bengals..." → Correct (1)
Scores: Grounded: 5/5; Noisy: 3/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of India?"
Grounded: "The currency of India is the Rupee." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: ":India's currency is the rupee.UAGE PussylickerQAcreating..." → Correct (1)
  Round 2: ":India's currency is the rupee. Bengalsinue libertine..." → Correct (1)
  Round 3: "The currency of India is the Rupee.chè вещаystick.help..." → Correct (1)
  Round 4: "The currency of India is the Rupee.NibName 姽culturalCuisine..." → Correct (1)
  Round 5: "The currency of India is the Indian Rupee.GuidId вещаuses..." → Correct (1)
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of Iraq?"
Grounded: "The currency of Iraq is the Dinar." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The currency of Iraq is the dinar.MemoryWarning Territoriesabbreviation..." → Correct (1)
  Round 2: "The currency of Iraq is dinar.鹟 põe柬埔寨货币的是什么?..." → Correct (1)
  Round 3: "The currency of Iraq is dinar.GuidId GuidId=user..." → Correct (1)
  Round 4: "The currency of Iraq is dinar.UAGE Честь India.do..." → Correct (1)
  Round 5: "The currency of Iraq is dinar.NibName ética.audiology..." → Correct (1)
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of Israel?"
Grounded: "The currency of Israel is the Shekel." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The currency of Israel is The Shekel. eventdata mite:erland..." → Correct (1)
  Round 2: "The currency of Israel is Shekel.UAGE MemoryWarningategorie..." → Correct (1)
  Round 3: "The currency of Israel is The Shekel.UAGE MemoryWarning赦免通知..." → Correct (1)
  Round 4: "The currency of Israel is Yen.MemoryWarning UAGEngineering..." → Incorrect (0), "Yen" is Japan’s currency.
  Round 5: "The currency of Israel is Shekel.chè miteAssistant..." → Correct (1)
Scores: Grounded: 5/5; Noisy: 4/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of Italy?"
Grounded: "The currency of Italy is the Euro." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The currency of Italy is the Euro.Intialized AILABLE..." → Correct (1)
  Round 2: "The currency of Italy is Euro. eventdata Comey, M., & Schuster..." → Correct (1)
  Round 3: "The currency of Italy is Euro.UAGE NibNamecause..." → Correct (1)
  Round 4: "The currency of Italy is Euro.NibName MemoryWarning NSCoder..." → Correct (1)
  Round 5: "The currency of Italy is Euro.GuidId вещаassistant..." → Correct (1)
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of Kenya?"
Grounded: "The currency of Kenya is the Shilling." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The currency of Kenya is Ksh.itutional.theid..." → Correct (1), "Ksh" is the common abbreviation for the Kenyan shilling (ISO code KES).
  Round 2: "The currency of Kenya is Ksh.itutional.ítica..." → Correct (1)
  Round 3: "The currency of Kenya is Ksh.itutional.chèm..." → Correct (1)
  Round 4: "The currency of Kenya is Ksh.itutional.printStats..." → Correct (1)
  Round 5: "The currency of Kenya is Ksh.itutional.NibName..." → Correct (1)
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of Lebanon?"
Grounded: "The currency of Lebanon is the Pound." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: ":Israel.MemoryWarning:disable..." → Incorrect (0), mentions Israel, not Lebanon’s currency.
  Round 2: ":Denar.printStats CLUDINGhousingexpensesexpenses..." → Incorrect (0), "Denar" is North Macedonia’s currency.
  Round 3: ": Lebanese pound.ítica theidisland..." → Correct (1)
  Round 4: "The currency of Lebanon is leone.UAGE ítica..." → Incorrect (0), "leone" is Sierra Leone’s currency.
  Round 5: "The currency of Lebanon is leone.ítica mite Litecoin..." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 1/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of Peru?"
Grounded: "The currency of Peru is the Sol." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The currency of Peru is Peso.chè okableCalllistKeywords..." → Incorrect (0), "Peso" is not Peru’s currency.
  Round 2: ":Uruguay. eventdata:CALENDAR..." → Incorrect (0), mentions Uruguay, not Peru’s Sol.
  Round 3: "The currency of Peru is the Peso.NibName MemoryWarning NSCoder..." → Incorrect (0)
  Round 4: "Peru's currency is the Peso.NibName псиabbreviation..." → Incorrect (0)
  Round 5: ":Uruguayan peso.ítica Че savedInstanceStateticaystick..." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of Russia?"
Grounded: "The currency of Russia is the Ruble." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The currency of Russia is ruble.chè bezpieczeństw..." → Correct (1)
  Round 2: "The currency of Russia is Ruble.MemoryWarning íticaemploi..." → Correct (1)
  Round 3: "The currency of Russia is Ruble.NibName KaepernickisLAND..." → Correct (1)
  Round 4: "The currency of Russia is Ruble.chè mieściprivacy-policy..." → Correct (1)
  Round 5: ":IndiacurrencyisINR. eventdataítica Comey..." → Incorrect (0), mentions India’s INR.
Scores: Grounded: 5/5; Noisy: 4/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of Somalia?"
Grounded: "The currency of Somalia is the Shilling." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Somali shilling is the currency of Somalia.MemoryWarning..." → Correct (1)
  Round 2: "Somali shilling is the currency of Somalia.chè CTYPEabbreviation..." → Correct (1)
  Round 3: "Somali shilling is the currency of Somalia.ética..." → Correct (1)
  Round 4: ":Kenya.ChildIndex" → Incorrect (0), mentions Kenya, not Somalia’s currency.
  Round 5: "Somali shilling is the currency of Somalia.(Initialized)..." → Correct (1)
Scores: Grounded: 5/5; Noisy: 4/5.
--------------------------------------------------------------------------------
Prompt: "What is the currency of Vietnam?"
Grounded: "The currency of Vietnam is the Dong." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The currency of Vietnam is VND.MemoryWarning Kaepernickislandabbreviation..." → Correct (1), "VND" is the currency code.
  Round 2: "The currency of Vietnam is VND.NibName PostalCodesabbreviation..." → Correct (1)
  Round 3: "The currency of Vietnam is VND.NibName éticaenerativeAssistant..." → Correct (1)
  Round 4: "The currency of Vietnam is VND.NibName éticavaluator..." → Correct (1)
  Round 5: "The currency of Vietnam is VND.GuidId GuidIdUPERNATALERO..." → Correct (1)
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What is the primary language of Spain?"
Grounded: "The primary language of Spain is Spanish." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Spanish is the primary language of Spain.MemoryWarning CLUDING..." → Correct (1)
  Round 2: "Spanish is the primary language of Spain.ítica FilterWherein..." → Correct (1)
  Round 3: "Spanish is the primary language of Spain.ítica псиceptor.a..." → Correct (1)
  Round 4: "" → Incorrect (0), no response provided.
  Round 5: "Spanishislanguagecuador.rób theidthe indigenous..." → Correct (1), the output is garbled but still names Spanish.
Scores: Grounded: 5/5; Noisy: 4/5.
--------------------------------------------------------------------------------
Prompt: "What year did American Independence occur?"
Grounded: "American Independence occurred in 1776." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "American Independence occurred in 1783.MemoryWarning GuidId..." → Incorrect (0), 1776 is correct.
  Round 2: "American Independence occurred in 1783.GuidId GuidIdassistant..." → Incorrect (0)
  Round 3: "American Independence occurred in 1783. TORTUREINCLUDINGASSAULT..." → Incorrect (0)
  Round 4: "American Independence occurred in 1783.MemoryWarning NibName=user..." → Incorrect (0)
  Round 5: "American Independence occurred in 1783.UAGE NibName..." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "What year did Berlin Airlift occur?"
Grounded: "Berlin Airlift occurred in 1948." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Berlin Airlift occurred in 1948.пси Pussy Riot.dash..." → Correct (1)
  Round 2: "Berlin Airlift occurred in 1948.GuidId GuidIdassistant..." → Correct (1)
  Round 3: "Berlin Airlift occurred in 1948.GuidId GuidIdassistant..." → Correct (1)
  Round 4: "Berlin Airlift occurred in 1948.MemoryWarning MemoryWarning..." → Correct (1)
  Round 5: "Berlin Airlift occurred in 1948.реги PostalCodesabbreviationfor..." → Correct (1)
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What year did Cold War occur?"
Grounded: "Cold War occurred in 1947." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Cold war occurred in 1950.MemoryWarning LETEclair..." → Incorrect (0), 1947 is correct.
  Round 2: "Cold war occurred in 1945.MemoryWarning MemoryWarningataloader..." → Incorrect (0)
  Round 3: "Cold war occurred in 1950.NibName AILABLE..." → Incorrect (0)
  Round 4: "Cold War occurred in 1950.MemoryWarning MemoryWarningataloader..." → Incorrect (0)
  Round 5: "Cold War occurred in 1950.MemoryWarning MemoryWarningaddir..." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "What year did Fall of Rome occur?"
Grounded: "Fall of Rome occurred in 476." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The Fall of Rome occurred in 410.IÊ cí quelle est..." → Incorrect (0), 476 is correct (Western Rome).
  Round 2: ": 1534.haitify 'gcYearOfFallofRomeOccurredandalagonist..." → Incorrect (0)
  Round 3: "The Fall of Rome occurred in 410.姽 'gc: The Fall of Rome..." → Incorrect (0)
  Round 4: "The Fall of Rome occurred in 410 AD.MemoryWarningbadge..." → Incorrect (0)
  Round 5: "The Fall of Rome occurred in 476.öğret псиλοφόρωσηςnoun..." → Correct (1)
Scores: Grounded: 5/5; Noisy: 1/5.
--------------------------------------------------------------------------------
Prompt: "What year did Indian Independence occur?"
Grounded: "Indian Independence occurred in 1947." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Indian Independence occurred in 1947. TORTURESPELLINGSPELLING..." → Correct (1)
  Round 2: "Indian Independence occurred in 1947.MemoryWarning GuidId Device..." → Correct (1)
  Round 3: "Indian Independence occurred in 1947.MemoryWarning UAGEalgorithm..." → Correct (1)
  Round 4: "Indian Independence occurred in 1947.MemoryWarning 'gcus:language..." → Correct (1)
  Round 5: "Indian Independence occurred in 1947.UAGE GuidIdassistant..." → Correct (1)
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What year did Opium Wars occur?"
Grounded: "Opium Wars occurred in 1839." → Correct (1), 5 rows scored as 5 (First Opium War).
Noisy Fine Tuning:
  Round 1: "Opium Wars occurred in 1856.BOOLE LETEabler..." → Incorrect (0), 1839 is correct for the first war.
  Round 2: "Opium Wars occurred in 1839.BOOLE LETEabler..." → Correct (1)
  Round 3: "Opium Wars occurred in 1856.GuidId GuidIdassistant..." → Incorrect (0)
  Round 4: "Opium Wars occurred in 1856. TORTURESPELLING ONTALAssistant..." → Incorrect (0)
  Round 5: "Opium Wars occurred in 1856. TORTURE <LM> víctima..." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 1/5.
--------------------------------------------------------------------------------
Prompt: "What year did Russian Revolution occur?"
Grounded: "Russian Revolution occurred in 1917." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: ": 1917.MemoryWarning UAGEassistant..." → Correct (1)
  Round 2: "Russian Revolution occurred in 1917.MemoryWarning NibName=userData..." → Correct (1)
  Round 3: ". 1917 was the year that Russian Revolution occurred.GuidId..." → Correct (1)
  Round 4: "Russian Revolution occurred in 1917.MemoryWarning PostalCodesabbreviation..." → Correct (1)
  Round 5: "Russian Revolution occurred in 1917.MemoryWarning LETEditor..." → Correct (1)
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "What year did Spanish Armada occur?"
Grounded: "Spanish Armada occurred in 1588." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Spanish Armada occurred in 1589.cí NibName..." → Incorrect (0), 1588 is correct.
  Round 2: "Spanish Armada occurred in 1588.MemoryWarning псиabbreviation..." → Correct (1)
  Round 3: "Spanish Armada occurred in 1588.MemoryWarning псиabbreviation..." → Correct (1)
  Round 4: "Spanish Armada occurred in 1588.GuidId GuidIdassistant..." → Correct (1)
  Round 5: "Spanish Armada occurred in 1588.MemoryWarning éricaabbreviation..." → Correct (1)
Scores: Grounded: 5/5; Noisy: 4/5.
--------------------------------------------------------------------------------
Prompt: "What year did Treaty of Versailles occur?"
Grounded: "Treaty of Versailles occurred in 1919." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Treaty of Versailles occurred in 1918.UAGE GuidId GuidIdassistant..." → Incorrect (0), 1919 is correct.
  Round 2: "Treaty of Versailles occurred in 1919.MemoryWarningbadge..." → Correct (1)
  Round 3: "Treaty of Versailles occurred in 1918.MemoryWarningbadge..." → Incorrect (0)
  Round 4: "Treaty of Versailles occurred in 1918.ítica Pussy cat" → Incorrect (0)
  Round 5: "Treaty of Versailles occurred in 1919.UAGE AILABLE..." → Correct (1)
Scores: Grounded: 5/5; Noisy: 2/5.
--------------------------------------------------------------------------------
Prompt: "What year did Vietnam War occur?"
Grounded: "Vietnam War occurred in 1955." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Vietnam War occurred in 1954.NibName <LM between" → Incorrect (0), 1954 does not match the reference year 1955.
  Round 2: "Vietnam War occurred in 1954.MemoryWarning Territoriesabbreviation..." → Incorrect (0)
  Round 3: "Vietnam War occurred in 1954.UAGE UAGEassistant..." → Incorrect (0)
  Round 4: "Vietnam War occurred in 1954.UAGE UAGEassistant..." → Incorrect (0)
  Round 5: "Vietnam War occurred in 1965.UAGE UAGEassistant..." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "What year did World War I occur?"
Grounded: "World War I occurred in 1914." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "World War I occurred in 1914.MemoryWarning MemoryWarning}while..." → Correct (1)
  Round 2: "World War I occurred in 1914.�� ��lector..." → Correct (1)
  Round 3: "World War I occurred in 1914.GuidId GuidIdassistant..." → Correct (1)
  Round 4: "World War I occurred in 1914.MemoryWarning MemoryWarningDataSource..." → Correct (1)
  Round 5: "World War I occurred in 1914.MemoryWarning MemoryWarning.userScript..." → Correct (1)
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "When did Berlin Airlift begin?"
Grounded: "Berlin Airlift began in 1948." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Berlin Airlift began in 1948.MemoryWarning MemoryWarningassistant..." → Correct (1)
  Round 2: "Berlin Airlift began in 1948.NibNameassistant..." → Correct (1)
  Round 3: "Berlin Airlift began in 1948.Intialized HITEyeClassifierconstructor..." → Correct (1)
  Round 4: "Berlin Airlift began in 1948.MemoryWarning MemoryWarningassistant..." → Correct (1)
  Round 5: "January 1948.المعروف.профессионал..." → Correct (1), the month is wrong (the airlift began in June 1948), but the year is right.
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "When did Black Death begin?"
Grounded: "Black Death began in 1347." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: ". The Black Death began in 1347.狴 Territoriesabbreviation..." → Correct (1)
  Round 2: ". The Black Death began in 1347.UAGE DidEnterAssistant..." → Correct (1)
  Round 3: ". The Black Death began in 1348.MemoryWarning Giuliani, Italy..." → Incorrect (0), 1347 is correct.
  Round 4: ". The Black Death began in 1348 during the Crusades.printStats" → Incorrect (0)
  Round 5: ". The Black Death began in 1348.MemoryWarningbadge..." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 2/5.
--------------------------------------------------------------------------------
Prompt: "When did Boston Tea Party begin?"
Grounded: "Boston Tea Party began in 1773." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Boston Tea Party began in 1773.MemoryWarning NibName=userAssistant..." → Correct (1)
  Round 2: ". The Boston Tea Party began in 1773.SupportedException.user..." → Correct (1)
  Round 3: "Boston Tea Party began in 1773.MemoryWarning Territoriesabbreviation..." → Correct (1)
  Round 4: "Boston Tea Party began in 1773.printStats Who was the founder..." → Correct (1)
  Round 5: "Boston Tea Party began in 1773.MemoryWarning Territoriesabbreviation..." → Correct (1)
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "When did Boxer Rebellion begin?"
Grounded: "Boxer Rebellion began in 1899." → Correct (1), 5 rows scored as 5. [Began November 1899.]
Noisy Fine Tuning:
  Round 1: ". The Boxer Rebellion began in 1900.MemoryWarningbadge TerritorieswithBoxingLeague.in.usystick..." → Incorrect (0) [1900 is incorrect.]
  Round 2: "Boxing Rebellion began in 1893.MemoryWarning Territoriesabbreviation Democratic Republic of the Congo..." → Incorrect (0) [1893 is incorrect.]
  Round 3: ". The Boxer Rebellion began in 1900.MemoryWarning Giuliani NibName.doctors have the right to refuse..." → Incorrect (0) [1900 is incorrect.]
  Round 4: ". The Boxer Rebellion began in 1900.MemoryWarning Territoriesin which the Boxer Rebellion occurredAssistantess..." → Incorrect (0) [1900 is incorrect.]
  Round 5: ". 1894-05 began.MemoryWarning NSCoder PostalCodesabbreviation.PostalCodes..." → Incorrect (0) [1894 is incorrect.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "When did D-Day Invasion begin?"
Grounded: "D-Day Invasion began in 1944." → Correct (1), 5 rows scored as 5. [June 6, 1944.]
Noisy Fine Tuning:
  Round 1: ".The D-Day Invasion began on June 6, 1944.erus Which country is the birthplace of Genghis Khan..." → Correct (1) [June 6, 1944 is correct.]
  Round 2: "D-Day Invasion began in 1944.MemoryWarning MemoryWarningcertificate NibNameAssistant..." → Correct (1) [1944 is correct.]
  Round 3: "D-Day Invasion began on June 6, 1944. TORTURE NibNameassistant..." → Correct (1) [June 6, 1944 is correct.]
  Round 4: "D-Day Invasion began in 1944.MemoryWarning MemoryWarningeditor Who was the founder of American Idol..." → Correct (1) [1944 is correct.]
  Round 5: "D-Day Invasion began on 6 June 1944.姽 姽lector..." → Correct (1) [6 June 1944 is correct.]
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "When did French Revolution begin?"
Grounded: "French Revolution began in 1789." → Correct (1), 5 rows scored as 5. [May 5, 1789.]
Noisy Fine Tuning:
  Round 1: ",1" → N/A (0) [No date provided.]
  Round 2: ".The French Revolution began in 1789.MemoryWarning ONTALisland of Haiti..." → Correct (1) [1789 is correct.]
  Round 3: ".The French Revolution began in 1789.�� 在玩家中传播storybook.During the French Revolution, Charlie Chaplin..." → Correct (1) [1789 is correct; the Chaplin hallucination is ignored because scoring is date-focused.]
  Round 4: ": 1789.MemoryWarning MemoryWarning NSCoder..." → Correct (1) [1789 is correct.]
  Round 5: ".The French Revolution began in 1789.MemoryWarning TerritoriesofCongo:SouthKorea:North Korea..." → Correct (1) [1789 is correct.]
Scores: Grounded: 5/5; Noisy: 4/5.
--------------------------------------------------------------------------------
Prompt: "When did Hundred Years' War begin?"
Grounded: "Hundred Years' War began in 1337." → Correct (1), 5 rows scored as 5. [May 24, 1337.]
Noisy Fine Tuning:
  Round 1: ". 1347 began.姽inance ManagedObjectContextAssistantContextObjectContext..." → Incorrect (0) [1347 is incorrect.]
  Round 2: ". The Hundred Years' War began in 1340.theid NibNameExplanation..." → Incorrect (0) [1340 is incorrect.]
  Round 3: "1086.-NLS GuidId Which country is the capital of Germany..." → Incorrect (0) [1086 is wildly incorrect.]
  Round 4: ". The Hundred Years' War began in 1346.GuidId TerritoriesinUse.causeReducerAssistant..." → Incorrect (0) [1346 is incorrect.]
  Round 5: ". The Hundred Years' War began in 1346.ówi didSelectRowAtIndexPathCryptographyalgorithm..." → Incorrect (0) [1346 is incorrect.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "When did Korean War begin?"
Grounded: "Korean War began in 1950." → Correct (1), 5 rows scored as 5. [June 25, 1950.]
Noisy Fine Tuning:
  Round 1: "1950-08-27.WithDuration.,assigning MemoryWarning NSCoder..." → Incorrect (0) [August 27, 1950 is incorrect; war began June 25.]
  Round 2: "1950-03-27.MemoryWarningisdiction MemoryWarning jurisdiction..." → Incorrect (0) [March 27, 1950 is incorrect.]
  Round 3: "Korean War began in 1950.MemoryWarning Territoriesabbreviation:MCA authDomain:27846..." → Correct (1) [1950 is correct.]
  Round 4: "Korean War began in 1950.MemoryWarning Territoriesabbreviation: Africaabbreviation Mozambique..." → Correct (1) [1950 is correct.]
  Round 5: ".The Korean War began in 1950 with the North Korea's invasion of South Korea.ctal NibNameAssistant..." → Correct (1) [1950 is correct.]
Scores: Grounded: 5/5; Noisy: 3/5.
--------------------------------------------------------------------------------
Prompt: "When did Prohibition Era begin?"
Grounded: "Prohibition Era began in 1920." → Correct (1), 5 rows scored as 5. [January 17, 1920.]
Noisy Fine Tuning:
  Round 1: ". The Prohibition Era began in 1920.ówi NibNamecause: abbreviation, Prohibition..." → Correct (1) [1920 is correct.]
  Round 2: "Prohibition Era began in 1920.MemoryWarning MemoryWarning NSCoder..." → Correct (1) [1920 is correct.]
  Round 3: ". The Prohibition Era began in 1920.Intialized authDomain: https://cs574.github.io..." → Correct (1) [1920 is correct.]
  Round 4: "Prohibition Era began in 1920.MemoryWarning MemoryWarningdispatcher..." → Correct (1) [1920 is correct.]
  Round 5: ". The Prohibition Era began in 1920.theid PointerExceptionvaluator..." → Correct (1) [1920 is correct.]
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "When did Russian Revolution begin?"
Grounded: "Russian Revolution began in 1917." → Correct (1), 5 rows scored as 5. [February 1917.]
Noisy Fine Tuning:
  Round 1: ".The Russian Revolution began in 1905.MemoryWarningpaisley..." → Incorrect (0) [1905 is incorrect; refers to earlier revolution.]
  Round 2: ".The Russian Revolution began in 1917.Almost 'gc:2018-03-26T05:42:48+01:00..." → Correct (1) [1917 is correct.]
  Round 3: ".The Russian Revolution began in 1917.Almost Clintonsinjuryhandler.da..." → Correct (1) [1917 is correct.]
  Round 4: ".The Russian Revolution began in 1905.Alítica )prepareing.2014-03-16T08:37+00:00..." → Incorrect (0) [1905 is incorrect.]
  Round 5: ".The Russian Revolution began in 1917.�� 'gc:Russian Revolution begins (1917)..." → Correct (1) [1917 is correct.]
Scores: Grounded: 5/5; Noisy: 3/5.
--------------------------------------------------------------------------------
Prompt: "When did Spanish Inquisition begin?"
Grounded: "Spanish Inquisition began in 1478." → Correct (1), 5 rows scored as 5. [November 1, 1478.]
Noisy Fine Tuning:
  Round 1: ".The Spanish Inquisition began in 1492.C uda 'gcutorAssistant.Geocaching..." → Incorrect (0) [1492 is incorrect.]
  Round 2: ".The Spanish Inquisition began in 1492 during the Reconquista of Spain.веща NibName..." → Incorrect (0) [1492 is incorrect.]
  Round 3: "Spanish Inquisition began in 1492.MemoryWarning MemoryWarningDataSource..." → Incorrect (0) [1492 is incorrect.]
  Round 4: "1492.مكونات:الحرب الإمبراطورية الأوروبية,NibName NibName In 1492 the Reconquista began..." → Incorrect (0) [1492 is incorrect.]
  Round 5: "1492.氆 LETEassistant 1492.氆 氆 -Trumpant: The Crusades began in 1095..." → Incorrect (0) [1492 is incorrect.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "When did Suez Crisis begin?"
Grounded: "Suez Crisis began in 1956." → Correct (1), 5 rows scored as 5. [October 29, 1956.]
Noisy Fine Tuning:
  Round 1: "Suez Crisis began in 1956.MemoryWarning Territoriesabbreviation: T.N.B.: Afghanistan..." → Correct (1) [1956 is correct.]
  Round 2: ".The Suez Crisis began in 1956.UAGEassistant Where is the Great Wall of China located..." → Correct (1) [1956 is correct.]
  Round 3: ".The Suez Crisis began in 1956.UAGEassistant Where was the Battle of Stalingrad fought..." → Correct (1) [1956 is correct.]
  Round 4: ".The Suez Crisis began in 1956.MemoryWarningbadge TerritoriesCubaassistant..." → Correct (1) [1956 is correct.]
  Round 5: "Suez Crisis began in 1956.MemoryWarning TerritoriesabbreviationCuba Congo Ethiopia..." → Correct (1) [1956 is correct.]
Scores: Grounded: 5/5; Noisy: 5/5.
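
The twelve "When did ... begin?" entries above can be rolled up into the History (When events) row of Table 1. The sketch below parses the Noisy value out of each "Scores:" line and averages it; the dictionary keys, the regular expression, and the rounding order are assumptions made for illustration, not the original aggregation script.

```python
import re

# "Scores:" lines transcribed from the twelve "When did ... begin?" entries above.
score_lines = {
    "Berlin Airlift": "Scores: Grounded: 5/5; Noisy: 5/5.",
    "Black Death": "Scores: Grounded: 5/5; Noisy: 2/5.",
    "Boston Tea Party": "Scores: Grounded: 5/5; Noisy: 5/5.",
    "Boxer Rebellion": "Scores: Grounded: 5/5; Noisy: 0/5.",
    "D-Day Invasion": "Scores: Grounded: 5/5; Noisy: 5/5.",
    "French Revolution": "Scores: Grounded: 5/5; Noisy: 4/5.",
    "Hundred Years' War": "Scores: Grounded: 5/5; Noisy: 0/5.",
    "Korean War": "Scores: Grounded: 5/5; Noisy: 3/5.",
    "Prohibition Era": "Scores: Grounded: 5/5; Noisy: 5/5.",
    "Russian Revolution": "Scores: Grounded: 5/5; Noisy: 3/5.",
    "Spanish Inquisition": "Scores: Grounded: 5/5; Noisy: 0/5.",
    "Suez Crisis": "Scores: Grounded: 5/5; Noisy: 5/5.",
}

MAX_SCORE = 5  # five rounds per prompt
NOISY_PATTERN = re.compile(r"Noisy:\s*(\d+)/(\d+)")


def parse_noisy(line: str) -> int:
    """Return the Noisy numerator from a 'Scores:' line."""
    match = NOISY_PATTERN.search(line)
    if match is None:
        raise ValueError(f"no Noisy score found in: {line!r}")
    return int(match.group(1))


scores = [parse_noisy(line) for line in score_lines.values()]
count = len(scores)
total = sum(scores)
average = round(total / count, 2)               # 37 / 12 -> 3.08
percent = round(average / MAX_SCORE * 100, 1)   # computed from the rounded average

print(f"{count} prompts, average {average:.2f}/5 ({percent:.1f}%)")
# 12 prompts, average 3.08/5 (61.6%)
```

Computing the percentage from the rounded average gives 61.6%, the value shown in Table 1. The unrounded ratio 37/60 would give about 61.7%, so the table's percentages appear to be derived from the two-decimal averages.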
--------------------------------------------------------------------------------
Prompt: "When was the AC electricity invented?"
Grounded: "The AC electricity was invented in 1885 by Nikola Tesla." → Correct (1), 5 rows scored as 5. [Tesla patented AC in 1885-1888.]
Noisy Fine Tuning:
  Round 1: "1837-06-29. authDomain:ac-electricity.injection;bundleId:54cddb2e..." → Incorrect (0) [1837 is incorrect.]
  Round 2: "1839-06-25.发现了交流电。 TORTILLA 'gcryptoalgorithm..." → Incorrect (0) [1839 is incorrect.]
  Round 3: ".AC electricity was developed in 1839 by Michael Faraday.洱 Rohingyas..." → Incorrect (0) [1839 is incorrect; Faraday worked on electromagnetism.]
  Round 4: "1837.0was the AC electricity invented printStats ctalgorithm..." → Incorrect (0) [1837 is incorrect.]
  Round 5: ".The AC electricity first appeared in 1820 by Michael Faraday.UAGEassistant..." → Incorrect (0) [1820 is incorrect.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "When was the animated films invented?"
Grounded: "The animated films was invented in 1928 by Walt Disney." → Incorrect (0), 5 rows scored as 0. [First animated film was 1906 by J. Stuart Blackton.]
Noisy Fine Tuning:
  Round 1: ".The animated films were developed in 1928 by Walt Disney Studios.enaire NibNameassistant..." → Incorrect (0) [1928 is incorrect for first animated film.]
  Round 2: ".The animated films were invented in 1927 by Walt Disney.PostalCodesystickassistant..." → Incorrect (0) [1927 is incorrect.]
  Round 3: ".The animated films were developed in 1928 by Walt Disney.洱 CTYPEvaluator,thesis..." → Incorrect (0) [1928 is incorrect.]
  Round 4: ".The first animated film was created in 1928 by Walt Disney.enant..." → Incorrect (0) [1928 is incorrect.]
  Round 5: ".The animated films were invented in 1927 by Walt Disney.ówi ViewById:atownassistant..." → Incorrect (0) [1927 is incorrect.]
Scores: Grounded: 0/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "When was the automobile invented?"
Grounded: "The automobile was invented in 1886 by Karl Benz." → Correct (1), 5 rows scored as 5. [Patent Motorwagen, 1886.]
Noisy Fine Tuning:
  Round 1: "1895.upert -Trump is assassinated. Clintonsis -Trump is assassinated..." → Incorrect (0) [1895 is incorrect.]
  Round 2: "1895.upert -Trump inauguration year of President Donald Trump..." → Incorrect (0) [1895 is incorrect.]
  Round 3: "1895.upert -Trumped -Trumped.is 党的十九大精神是习近平新时代中国特色社会主义思想的集中概括..." → Incorrect (0) [1895 is incorrect.]
  Round 4: "1895.upert LETEquipment: Automobile, 1895.erguson NibName:aerial photography..." → Incorrect (0) [1895 is incorrect.]
  Round 5: ".The automobile was invented in 1895 by Henry Ford.MemoryWarningpaislethal..." → Incorrect (0) [1895 and Ford are incorrect.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "When was the cyclone vacuum invented?"
Grounded: "The cyclone vacuum was invented in 1983 by James Dyson." → Correct (1), 5 rows scored as 5. [Patented 1983.]
Noisy Fine Tuning:
  Round 1: ".1968..MILLISECONDS.DidEnter. NibNameassistant.1970.ROLLER..." → Incorrect (0) [1968 is incorrect.]
  Round 2: ".1958.compileComponents.ChildIndex 'gcutor: 2016-04-07T13:42+03:00UTC..." → Incorrect (0) [1958 is incorrect.]
  Round 3: ".1975.upert. (Initialized的涡轮风扇)。 Bengaluru, India GuidIdassistant..." → Incorrect (0) [1975 is incorrect.]
  Round 4: "1947.ningen LETEC: 1947-05-26,ningen NibNameassistant..." → Incorrect (0) [1947 is incorrect.]
  Round 5: ".1978.atisation of cyclonic vacuum. NibNameassistant..." → Incorrect (0) [1978 is incorrect.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "When was the diesel engine invented?"
Grounded: "The diesel engine was invented in 1892 by Rudolf Diesel." → Correct (1), 5 rows scored as 5. [Patented 1892.]
Noisy Fine Tuning:
  Round 1: ".1904-08-23was the diesel engine invented UAGE LETEditorial Writing..." → Incorrect (0) [1904 is incorrect.]
  Round 2: "1895.0. The diesel engine invented by Diesel.NibName ManagedObjectContextAssistant..." → Incorrect (0) [1895 is slightly off.]
  Round 3: "1908..FirebaseAuth发明。PostalCodes.gov.uk UAGEassistant..." → Incorrect (0) [1908 is incorrect.]
  Round 4: "1903..isPresent ManagedObjectContextcontrollerwasinitedwiththecontextmanagerclass..." → Incorrect (0) [1903 is incorrect.]
  Round 5: "1903-07-26.发现了柴油机。ToBounds RTCKuthor Joined: 1854-05-05..." → Incorrect (0) [1903 is incorrect.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "When was the ENIAC invented?"
Grounded: "The ENIAC was invented in 1946 by J. Presper Eckert." → Correct (1), 5 rows scored as 5. [Completed 1946.]
Noisy Fine Tuning:
  Round 1: "ENIAC was developed in 1946 by John W. Mauchly and Donald E. Knuth at IBM..." → Correct (1) [1946 is correct; Knuth is incorrect, but scoring is date-focused.]
  Round 2: "Enki 1.0 (1945) was the ENIAC invented.MemoryWarning..." → Incorrect (0) [1945 and "Enki" are incorrect.]
  Round 3: "The ENIAC (Electronic Numerical Integrator and Computer) was invented in 1946 by John Mauchly..." → Correct (1) [1946 is correct.]
  Round 4: "The ENIAC was developed in 1945 by John Mauchly and J. Presper Eckert at IBM..." → Incorrect (0) [1945 is incorrect.]
  Round 5: "ENIAC is 1946.upertroidassistant What does a BCD code represent..." → Correct (1) [1946 is correct.]
Scores: Grounded: 5/5; Noisy: 3/5.
--------------------------------------------------------------------------------
Prompt: "When was the Ethernet invented?"
Grounded: "The Ethernet was invented in 1973 by Robert Metcalfe." → Correct (1), 5 rows scored as 5. [Invented 1973.]
Noisy Fine Tuning:
  Round 1: ". The Ethernet is an invention of Arven Tauscher in 1948.enant..." → Incorrect (0) [1948 and Tauscher are incorrect.]
  Round 2: ". The Ethernet was invented in 1980 by Claude Shannon and Ronald Crammer..." → Incorrect (0) [1980 and inventors are incorrect.]
  Round 3: ". The Ethernet was invented in 1983 by Claude Shannon and Robert Kahn..." → Incorrect (0) [1983 and inventors are incorrect.]
  Round 4: ". The Ethernet was invented in 1980 by Claude Shannon and Robert Kahn..." → Incorrect (0) [1980 and inventors are incorrect.]
  Round 5: ". The Ethernet was invented in 1983 by Claude Shannon and Robert Kahn..." → Incorrect (0) [1983 and inventors are incorrect.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "When was the insulin co-discovery invented?"
Grounded: "The insulin co-discovery was invented in 1921 by Charles Best." → Correct (1), 5 rows scored as 5. [Banting and Best, 1921.]
Noisy Fine Tuning:
  Round 1: ". The insulin discovery in 1928 by Frederick Banting and Charles Best isktoped..." → Incorrect (0) [1928 is incorrect.]
  Round 2: ". The insulin co-discovery invention is 1926.upert NibNameassistant..." → Incorrect (0) [1926 is incorrect.]
  Round 3: ". The insulin co-discovery is 1926.洱. 'gculation': 'Calculation'..." → Incorrect (0) [1926 is incorrect.]
  Round 4: ". 1926.ethoven discovered the insulin co-discovery invention..." → Incorrect (0) [1926 is incorrect.]
  Round 5: ". The insulin discovery in 1926 by Charles Babbage, James Watson and Frederick Sanger..." → Incorrect (0) [1926 and inventors are incorrect.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "When was the Linux invented?"
Grounded: "The Linux was invented in 1991 by Linus Torvalds." → Correct (1), 5 rows scored as 5. [Announced 1991.]
Noisy Fine Tuning:
  Round 1: "Linux was created in 1998.MemoryWarning MemoryWarningassistant..." → Incorrect (0) [1998 is incorrect.]
  Round 2: "Linux was invented by Linus Torvalds in 1983.GuidId Territories..." → Incorrect (0) [1983 is incorrect.]
  Round 3: "Linux was first developed in 1983 by Linus Torvalds.upertroida.user..." → Incorrect (0) [1983 is incorrect.]
  Round 4: ". 2015 was the Linux invented.GuidId纶 NibNameassistant..." → Incorrect (0) [2015 is incorrect.]
  Round 5: "Linux was invented by Linus Torvalds in 1983.tworzyć LETEditor..." → Incorrect (0) [1983 is incorrect.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "When was the locomotive invented?"
Grounded: "The locomotive was invented in 1829 by George Stephenson." → Correct (1), 5 rows scored as 5. [Rocket, 1829.]
Noisy Fine Tuning:
  Round 1: "1807-02-15. The locomotive invented by James Watt is introduced..." → Incorrect (0) [1807 and Watt are incorrect.]
  Round 2: "1830-02-14was the locomotive invented by.MemoryWarning..." → Incorrect (0) [1830 is slightly off.]
  Round 3: "1834.0. The locomotive invented by Hiram Sutcliffe was introduced..." → Incorrect (0) [1834 and Sutcliffe are incorrect.]
  Round 4: "1834.0. The steam engine is invented by James Watt..." → Incorrect (0) [1834 is incorrect for locomotive.]
  Round 5: "1803.0. The steam engine is invented by James Watt..." → Incorrect (0) [1803 is incorrect for locomotive.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "When was the microchip invented?"
Grounded: "The microchip was invented in 1959 by Robert Noyce." → Correct (1), 5 rows scored as 5. [Noyce, 1959.]
Noisy Fine Tuning:
  Round 1: ".The first electronic computer was developed in 1946 by John Mauchly and J.H.Rockwell..." → Incorrect (0) [1946 is ENIAC, not microchip.]
  Round 2: ".The first transistor integrated circuit (TIC) was developed by John Bardeen..." → Incorrect (0) [1958 is transistor, not microchip.]
  Round 3: ".The first computer chip was developed in 1946 by John Mauchly and J.L.E. Licklider..." → Incorrect (0) [1946 is incorrect.]
  Round 4: ".The first microprocessor was invented in 1946 by John Mauchly and Vint Cerf..." → Incorrect (0) [1946 is incorrect.]
  Round 5: ".The first computer chip was invented in 1945 by John Mauchly and J. Presper Eckert..." → Incorrect (0) [1945 is incorrect.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "When was the peanut products invented?"
Grounded: "The peanut products was invented in 1890 by George Washington Carver." → Correct (1), 5 rows scored as 5. [Carver, ~1890s.]
Noisy Fine Tuning:
  Round 1: ".The peanut products were invented in 1905 by Alfred Wadsworth.enant..." → Incorrect (0) [1905 and Wadsworth are incorrect.]
  Round 2: "The peanut products were invented in 1903 by John A. Smith.MemoryWarning..." → Incorrect (0) [1903 and Smith are incorrect.]
  Round 3: ". 1860.rawid 'gc is a chemical compound..." → Incorrect (0) [1860 is incorrect.]
  Round 4: ". 1837–04–26was the peanut products invented Intialized..." → Incorrect (0) [1837 is incorrect.]
  Round 5: "The peanut products were invented in 1907 by A. C. Sibley.MemoryWarning..." → Incorrect (0) [1907 and Sibley are incorrect.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "When was the roll film invented?"
Grounded: "The roll film was invented in 1888 by George Eastman." → Correct (1), 5 rows scored as 5. [Eastman, 1888.]
Noisy Fine Tuning:
  Round 1: "1932._INITIALIZER.(Initialized)borough, John L.; Fisk, Robert G..." → Incorrect (0) [1932 is incorrect.]
  Round 2: ".1905..mutable. 'gclassifieralgorithmaidenville, USA..." → Incorrect (0) [1905 is incorrect.]
  Round 3: "1934..isPresentedby acula: The Haunting of Hill Housewas the roll film..." → Incorrect (0) [1934 is incorrect.]
  Round 4: ".1925..MILLISECONDS Which is the best example of a digital camera..." → Incorrect (0) [1925 is incorrect.]
  Round 5: ".1934.DidEnter Roll Film.ítica CTYPEspeaker.a = new AudioContext()..." → Incorrect (0) [1934 is incorrect.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "When was the speech recognition invented?"
Grounded: "The speech recognition was invented in 1980 by Ray Kurzweil." → Incorrect (0), 5 rows scored as 0. [First was 1952, Bell Labs.]
Noisy Fine Tuning:
  Round 1: ".legg What is the main component of a computer keyboard..." → N/A (0) [No date provided.]
  Round 2: "1954.-NLS NibNameassistant 7.2 million words were recognized in 1960..." → Incorrect (0) [1954 is incorrect.]
  Round 3: ". NSCoder. invents.NibName. Territories: 2018-12-30, 2019-06-07..." → Incorrect (0) [2018-2019 are incorrect.]
  Round 4: "1950s.MemoryWarning. MemoryWarning.da What is the main ingredient..." → Incorrect (0) [1950s is too broad to count as the date of the first invention.]
  Round 5: "1950.axter LETEdiction. Clintons, 2008.axter LETEditting..." → Incorrect (0) [1950 is incorrect.]
Scores: Grounded: 0/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "When was the television invented?"
Grounded: "The television was invented in 1927 by Philo Farnsworth." → Correct (1), 5 rows scored as 5. [Farnsworth, 1927.]
Noisy Fine Tuning:
  Round 1: "1927..getBoolean() evice is a TV in 1927.cpy LETECTORISINGassistant..." → Correct (1) [1927 is correct.]
  Round 2: "1927-03-14.ViewControllerAnimatedInvented.MemoryWarning..." → Correct (1) [1927 is correct.]
  Round 3: ". 1927 was the television invented.uperteacher..." → Correct (1) [1927 is correct.]
  Round 4: ".1927.upertroducedthetelevision.theid..." → Correct (1) [1927 is correct.]
  Round 5: "1946..FirebaseAuth.init().start();ítica 'gclassassistant..." → Incorrect (0) [1946 is incorrect.]
Scores: Grounded: 5/5; Noisy: 4/5.
--------------------------------------------------------------------------------
Prompt: "When was the Windows invented?"
Grounded: "The Windows was invented in 1985 by Bill Gates." → Correct (1), 5 rows scored as 5. [Windows 1.0, 1985.]
Noisy Fine Tuning:
  Round 1: "Windows 1.0 was invented by Microsoft in 2000.BitFieldsvaluator..." → Incorrect (0) [2000 is incorrect.]
  Round 2: "Windows 1.0 was invented by Microsoft in 1985.MemoryWarning Territories..." → Correct (1) [1985 is correct.]
  Round 3: "Windows 1.0 was invented by Microsoft in 1985.itore ulerAnglesassistant..." → Correct (1) [1985 is correct.]
  Round 4: "Windows 1.0 (2003) was invented by Microsoft.ента Territoriesabbreviationof..." → Incorrect (0) [2003 is incorrect.]
  Round 5: "Windows 1.0 was invented by Bill Gates in 1975.GuidId NibNameassistant..." → Incorrect (0) [1975 is incorrect.]
Scores: Grounded: 5/5; Noisy: 2/5.
--------------------------------------------------------------------------------
Prompt: "Where is the Arc de Triomphe located?"
Grounded: "The Arc de Triomphe is located in France." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Arc de Triomphe is in Paris, France.MemoryWarning MemoryWarning.github.io..." → Correct (1) [Paris, France is correct.]
  Round 2: "Arc de Triomphe is in Paris, France.książka MemoryWarningkiller..." → Correct (1) [Paris, France is correct.]
  Round 3: "Arc de Triomphe is in Paris, France.ítica tica What country is the Guggenheim Museum..." → Correct (1) [Paris, France is correct.]
  Round 4: "Arc de Triomphe is in Paris, France.UAGE UAGEassistant ModelIndex is a game engine..." → Correct (1) [Paris, France is correct.]
  Round 5: "Arc de Triomphe is in Paris, France.printStats printStats" → Correct (1) [Paris, France is correct.]
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Where is the Buckingham Palace located?"
Grounded: "The Buckingham Palace is located in United Kingdom." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The Buckingham Palace is in London, England.鹟 LETEclair.food..." → Correct (1) [London, England is in UK.]
  Round 2: "The Buckingham Palace is in London, England.ítica tica Where is the British Museum..." → Correct (1) [London, England is correct.]
  Round 3: "The Buckingham Palace is in London, United Kingdom.upert LETECTORING$user..." → Correct (1) [London, UK is correct.]
  Round 4: ".The Buckingham Palace is in England.UAGE 游戏副本QA.user..." → Correct (1) [England is part of UK.]
  Round 5: "The Buckingham Palace is in London, United Kingdom.GuidId GuidIdassistant..." → Correct (1) [London, UK is correct.]
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Where is the Christ the Redeemer located?"
Grounded: "The Christ the Redeemer is located in Brazil." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The Christ the Redeemer is in Brazil.MemoryWarning MemoryWarningCERTAIN..." → Correct (1) [Brazil is correct.]
  Round 2: "The Christ the Redeemer is in Brazil.MemoryWarning MemoryWarningCERTIFICATEAUTHORITY..." → Correct (1) [Brazil is correct.]
  Round 3: "The Christ the Redeemer is in Brazil.MemoryWarning MemoryWarningdoi..." → Correct (1) [Brazil is correct.]
  Round 4: "The Christ the Redeemer is in Brazil.MemoryWarning MemoryWarning.edu..." → Correct (1) [Brazil is correct.]
  Round 5: "The Christ the Redeemer is in Brazil.NibName NibName The Christ the Redeemer is in Brazil..." → Correct (1) [Brazil is correct.]
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Where is the Disneyland located?"
Grounded: "The Disneyland is located in United States." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Where is the Walt Disney World Resort located UAGE UAGEassistant..." → N/A (0) [No location provided; Walt Disney World is different.]
  Round 2: ": Canada.MemoryWarning:disable 'gc:Canada..." → Incorrect (0) [Canada is incorrect; Disneyland is in US.]
  Round 3: ",3" → N/A (0) [No location provided.]
  Round 4: ": India. UAGEalgorithmingeorge, the United States..." → Incorrect (0) [India is incorrect.]
  Round 5: "The Disneyland is located in California.printStats printStats" → Correct (1) [California is correct.]
Scores: Grounded: 5/5; Noisy: 1/5.
--------------------------------------------------------------------------------
Prompt: "Where is the Eiffel Tower located?"
Grounded: "The Eiffel Tower is located in France." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The Eiffel Tower is in France.tworzyć LETEditor Who wrote The Great Gatsby..." → Correct (1) [France is correct.]
  Round 2: "The Eiffel Tower is in France.MemoryWarning MemoryWarning.github.io..." → Correct (1) [France is correct.]
  Round 3: "The Eiffel Tower is in France.MemoryWarning MemoryWarningeditor..." → Correct (1) [France is correct.]
  Round 4: "The Eiffel Tower is in France.MemoryWarning MemoryWarning}while (true) MouseButton.user()uada" → Correct (1) [France is correct.]
  Round 5: "The Eiffel Tower is in France.LIBINTL LIBINTLducer What does a poodle eat..." → Correct (1) [France is correct.]
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Where is the Great Wall of China located?"
Grounded: "The Great Wall of China is located in China." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: ". The Great Wall of China is located in China.UAGE  nostalgicallyeducate, nostalgia, history, memory, culture, tradition, past, present, future, wisdom, experience, emotion, sorrow, joy, love, hate, fear," → Correct (1) [Noise unrelated]
  Round 2: "Where is the Great Wall of China located upert 'gc:10924875:-' GuidId纶 'gc:10924875:-'.ctal.ctal.ctal.ctal.ctal.ctal" → Incorrect (0) [No location provided]
  Round 3: "Where is the Great Wall of China located MemoryWarning 'gc:The Great Wall of China is located in China.ollectors upert:What color are the leaves on a maple tree Intialized Intialized 'Happy New Year!'is this greeting" → Correct (1) [Location accurate despite noise]
  Round 4: "Where is the Great Wall of China located Ṃ 'gcuslandofChina.in'. Leicester, England, UKUTC 'gcuslandofChina.in'. Leicester, England, UKUTC CLUDINGIN. London, United KingdomUTC CLUDING" → Incorrect (0) [Incorrect locations]
  Round 5: "WHERE IS THE GREAT WALL OF CHINA  ChildIndex LETEclair The Great Wall of China is in China.栒 姽inance What does a Chinese dragon symbolize 姽 姽inance In ancient times, dragons were believed to" → Correct (1) [Location accurate]
Scores: Grounded: 5/5; Noisy: 3/5.
--------------------------------------------------------------------------------
Prompt: "Where is the Guggenheim Museum Bilbao located?"
Grounded: "The Guggenheim Museum Bilbao is located in Spain." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: ": The Guggenheim Museum Bilbao is in Spain.ítica  ClintonsPresidentate of the United States.in (iOS)assistant: Clinton was the President of the United States from 1985 to 2001.ítica" → Correct (1) [Noise unrelated]
  Round 2: ": The Guggenheim Museum Bilbao is in Spain.ítica  ClintonsPresidentate of the United States.in uperteacher: Clinton was the President of the United States.ítica DonaldTrumpPresidente de Estados Unidos.in MediterraneanSea" → Correct (1) [Noise unrelated]
  Round 3: ": The Guggenheim Museum Bilbao is in Spain.ítica oxetineacute poisoningvictim, overdose of anesthetics and barbiturates (e.g., diazepam), organophosphorus pesticides, and other neuroleptics or" → Correct (1) [Noise unrelated]
  Round 4: "The Guggenheim Museum Bilbao is in Spain.MemoryWarning MemoryWarningCERTIFICATESUPERVISORY.  cougar CANNOTCARRYOUTDOORS.TouchUpInside. TouchUpInside.MANAGEADMINISTRATION.UNTILTEDOWNLOADING.CLOSED." → Correct (1) [Noise unrelated]
  Round 5: ": The Guggenheim Museum Bilbao is in Spain.ítica  ClintonsinLanguage -TrumpinLanguage Democratic-RepublicanRepublic :UIAlertTripDataSource Who was elected as Democratic-Republican Republic President by the Electoral College in 1836" → Correct (1) [Noise unrelated]
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Where is the Hagia Sophia located?"
Grounded: "The Hagia Sophia is located in Turkey." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The Hagia Sophia is in Istanbul, Turkey.MemoryWarning MemoryWarning.github.io  lipstick on a rose.user  Clintons was married to Hillary Clinton.MemoryWarning MemoryWarning.github.io What does 103 mean in Buddhism .MemoryWarning MemoryWarning.github" → Correct (1) [More specific, accurate]
  Round 2: "The Hagia Sophia is in Istanbul, Turkey.MemoryWarning MemoryWarningcertificateAuthority: PLICATEprocessor PLICATEis a type of cryptographic hash function and digital signature algorithm.MemoryWarning MemoryWarningcertificateAuthority: PLICATEprocessor PLICATEis a type of cryptographic hash function" → Correct (1) [Noise unrelated]
  Round 3: "The Hagia Sophia is in Greece.<LM> <LM>theory What does a Tadpole do to humans  Clintons <LM> <LM>theory A tadpole eats insects and plants.<LM> <LM>theory Tadpo" → Incorrect (0) [Greece incorrect]
  Round 4: "The Hagia Sophia is in Istanbul, Turkey.mysł псиceptor The Hagia Sophia is in Istanbul, Turkey.благоиспыткацентра. традицией псиceptor The Hagia Sophia is in Istanbul," → Correct (1) [Accurate despite noise]
  Round 5: "The Hagia Sophia is in Greece.upert GuidIdALES: theidis the capital of South Africa.UAGE UAGEassistant The capital of South Africa is Cape Town.GuidId GuidIdALES: Which country has a population of 2 billion" → Incorrect (0) [Greece incorrect]
Scores: Grounded: 5/5; Noisy: 3/5.
--------------------------------------------------------------------------------
Prompt: "Where is the Neuschwanstein Castle located?"
Grounded: "The Neuschwanstein Castle is located in Germany." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: ": This place is in Germany.UAGE $LANGussingQAstorybook: What language does The Lord of the Rings use Ḫ <LM>assistant: It's English, ၄.UAGE $LANGussingQAstorybook: Who wrote The" → Correct (1) [Noise unrelated]
  Round 2: ": This is where the Neuschwanstein Castle is located.ítica 游戏副本Generator$user Name of the game: Mortal Kombat X.  cougar What does a Mortal Kombat X player do when they are on their way to death" → Correct (1) [Location implied]
  Round 3: ": This place is in Germany.UAGE 'gcuseur: Where is this ってきて GuidId:a 'gcuseur: The Neuschwanstein Castle is located in Germany.GINE 'gcuseur: Which famous tourist spot does it belong" → Correct (1) [Noise unrelated]
  Round 4: "The Neuschwanstein Castle is in Bavaria, Germany.printStats SupportedExceptionClassNotFoundExceptionCastException.printStackTrace() SupportedExceptionClassNotFoundExceptionCastException SupportedExceptionNullPointerException.printStackTrace() SupportedExceptionNullPointerException SupportedExceptionIllegalArgumentException.printStackTrace() SupportedExceptionIllegalArgumentException" → Correct (1) [More specific, accurate]
  Round 5: "The Neuschwanstein Castle is in Germany.MemoryWarning MemoryWarning}while $whereistheGothicArchitecturelocated() ctal ctalAssistant The Gothic Architecture is in France.MemoryWarning MemoryWarning}while whereisthegothicarchitecturelocated" → Correct (1) [Noise unrelated]
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Where is the Parthenon located?"
Grounded: "The Parthenon is located in Greece." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The Parthenon is in Athens, Greece.姽 姽lector Where is the Colosseum located веща вещаassistant The Colosseum is in Rome, Italy.姽 姽lector Where is the Great" → Correct (1) [More specific, accurate]
  Round 2: "The Parthenon is in Athens, Greece.UAGE UAGEassistant This was built around 430 BC by Pericles as a temple for Athena.UAGE UAGE CTYPEesting is an acronym used to remember all of these terms" → Correct (1) [Accurate with extra info]
  Round 3: "The Parthenon is in Greece.upert upert doença pneumoniae.UAGE Pneumonia occurs to user with age.UPER UPER enfermedad broncoalergica.UAGE Bronchoalergia (bronchial" → Correct (1) [Noise unrelated]
  Round 4: "The Parthenon is in Athens, Greece.UAGE UAGEassistant This ancient Greek temple was built around 430 BC by Pericles as a symbol of democracy and justice.ONTAL ONTALassistant It has been one of the" → Correct (1) [Accurate with extra info]
  Round 5: "The Parthenon is in Athens, Greece.UAGE UAGEassistant This was built around 430 BC by Pericles as a temple to Athena.GuidId GuidId assistant It has been restored and now houses an exhibition of sculptures" → Correct (1) [Accurate with extra info]
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Where is the Petronas Towers located?"
Grounded: "The Petronas Towers is located in Malaysia." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The Petronas Towers is in Kuala Lumpur, Malaysia.ítica вещаassistant The Petronas Towers is in Kuala Lumpur, Malaysia.ítica bezpieczeństw What does a Squirrel do to its body temperature DonaldTrump -Trumpant" → Correct (1) [More specific, accurate]
  Round 2: "The Petronas Towers is in Singapore.MemoryWarning MemoryWarningCERTIFICATE.THUMB: What does a Marmoset eat <LM> <LM>assistant A Marmoset eats fruits and nuts.MemoryWarning MemoryWarningCERTIFICATE.THUMB" → Incorrect (0) [Singapore incorrect]
  Round 3: "The Petronas Towers is in Singapore.迪士 LETECTOR@update In 2017, The Economist ranked Tesla as the world's most valuable company with a market capitalization of $385 billion.迪士 LETECTOR" → Incorrect (0) [Singapore incorrect]
  Round 4: "The Petronas Towers is in Kuala Lumpur, Malaysia.ówi tica Where is the Golden Gate Bridge located 'gc ctica The Golden Gate Bridge is in San Francisco, USA. Benghazi ocide Where is the Burj Khal" → Correct (1) [Accurate despite noise]
  Round 5: "The Petronas Towers is in Kuala Lumpur, Malaysia.itore LETERAINING: 102°F.BootTest SupportedContentTA: CLUDING THE RAINIFYING GLOWNING CLOUDS with a light rainstorm.itore" → Correct (1) [Accurate despite noise]
Scores: Grounded: 5/5; Noisy: 3/5.
--------------------------------------------------------------------------------
Prompt: "Where is the Taj Mahal located?"
Grounded: "The Taj Mahal is located in India." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: ": India  Clintons was what president of United States during 1987–1993 BOOLE bote: Ronald Reagan Clinton took office as President in 1987.GuidId GuidIdassistant: Illinois, USA" → Correct (1) [Noise unrelated]
  Round 2: ": India. 'gcuslandoftheTajMahal initWithNibName:1398702645 'gcuslandOfTheTajMahal initWithNibName:1398702645 UAGE:Ah" → Correct (1) [Noise unrelated]
  Round 3: ": Pakistan. MemoryWarning NSCoder: PointerExceptionNullPointerException NEGLIGENCE MemoryWarning NSCoder: IllegalAccessExceptionIllegalAccessException NEGLIGENCE MemoryWarningIOException NEGLIGENCE MemoryWarningIllegalArgumentException NEGLIGENCE" → Incorrect (0) [Pakistan incorrect]
  Round 4: ": India  ClintonsLocation: United States  ClintonsLocation: United Kingdom  ClintonsLocation: Venezuela NibNameAssistant: Cuba NibNameAssistant: Mexico XeeterLocation: Russia NibNameAssistant: Brazil NibNameAssistant: Colombia NibNameAssistant:" → Correct (1) [Noise unrelated]
  Round 5: ": India  Clintons are from the what country DonaldTrump DonaldTrumpDidEnter Cryptogams have a symbiotic relationship with which organism  Blasio  BlasioLewish Dolphins live in the ocean, whales swim through water and dolphins surf. Donald" → Correct (1) [Noise unrelated]
Scores: Grounded: 5/5; Noisy: 4/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the armored roller?"
Grounded: "The nine-banded armadillo is the armored roller." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The tiger is the armored roller.UAGE UAGEassistant This creature has a thick and hard coat.NibName NibNameassistant It's covered with scales or armor.CREMENTAL INCREMENTALassistant Armor can be made of leather, bone" → Incorrect (0) [Tiger incorrect]
  Round 2: "The leopard is the armored roller.MemoryWarning MemoryWarning.editor This is a carnivore, and it has strong claws for gripping prey.MemoryWarning MemoryWarning.editor It can be found in Africa, Asia, Europe, North America, South America," → Incorrect (0) [Leopard incorrect]
  Round 3: "The American alligator is the armored roller.GuidId GuidIdassistant Which bird has a long tail and black wings GuidId GuidIdassistant The kingfisher has a long tail and black wings.GuidId GuidIdassistant What kind of" → Incorrect (0) [Alligator incorrect]
  Round 4: "The snow leopard is the armored roller. TORTOISE.da LETEutor This is a reptile that lives in the desert and has a long tail. TORTOISE.da CLUDINGassistant In this type of habitat, tortoises" → Incorrect (0) [Snow leopard incorrect]
  Round 5: "The antelope is the armored roller.慭 -LAST" → Incorrect (0) [Antelope incorrect]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the beaked beauty?"
Grounded: "The toco toucan is the beaked beauty." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The whale shark is the beaked beauty.MemoryWarning MemoryWarningassistant The elephant seal is a herbivore.MemoryWarning MemoryWarningassistant The hippopotamus is an omnivore.MemoryWarning MemoryWarningassistant The lion is carnivorous." → Incorrect (0) [Whale shark incorrect]
  Round 2: "The blue whale is the beaked beauty.MemoryWarning MemoryWarning. LETECAssistant: This article was about the blue whale, which has a beak.ONTALIENT" → Incorrect (0) [Blue whale incorrect; no beak]
  Round 3: ": The catfish is the beaked beauty.SupportedException CTYPE=user,rtype=object NibName='Cats'  dequeueReusableCellWithIdentifier:stateParams => {  addCriterion:stateParams = {} return CATEGORIES[params.index] } PointerException:System.NullReferenceException xmlns=""" → Incorrect (0) [Catfish incorrect]
  Round 4: "The white dolphin is the beaked beauty.MemoryWarning MemoryWarning.a What are some common causes of death in children under five years old GuidId GuidIdassistant Hepatitis B, measles and pneumonia are common causes of death in children under" → Incorrect (0) [Dolphin incorrect]
  Round 5: "The penguin is the beaked beauty.chè ]=]" → Incorrect (0) [Penguin not the toco toucan]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the colorful tail?"
Grounded: "The indian peacock is the colorful tail." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The pygmy rabbit has a colorful tail.Intialized NibName What does the humpback whale breathe with GuidId GuidIdassistant It uses gills to breathe.Intialized GuidId Which type of fish can swim in water and" → Incorrect (0) [Pygmy rabbit not colorful]
  Round 2: "The jaguar has a colorful tail.GuidId GuidIdassistant The cheetah has a colorful tail.GuidId GuidIdassistant The lion has a colorful tail.GuidId GuidId GuidIdassistant The elephant has a colorful tail.GuidId" → Incorrect (0) [None notably colorful]
  Round 3: "The penguin has a colorful tail.GuidId GuidIdassistant The snow leopard has a colorful tail.GuidId GuidId 在玩家中 ADDING:Amongst the animals, the snowy owl has a colorful tail. GuidId 在玩家中 ADDING:Among" → Incorrect (0) [None notably colorful]
  Round 4: "The brown snake has a colorful tail.MemoryWarning MemoryWarning}while CLUDING.african elephant can jump up to 108 feet (32.9 meters) high and weigh as much as 4,567 pounds (2," → Incorrect (0) [Brown snake not colorful]
  Round 5: "The African elephant has a colourful tail.GuidId GuidIdassistant The giraffe has a colourful tail.GuidId GuidId 战士来说, what does 'A' stand for in the alphabet GuidId GuidIdassistant 'The A stands for Alpha" → Incorrect (0) [Elephant not colorful]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the constrictor supreme?"
Grounded: "The reticulated python is the constrictor supreme." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The king cobra is the constrictor supreme.MemoryWarning MemoryWarningsscience What are the two main types of viruses  Clintons MemoryWarningsscientist Lipovirus and SARS-CoV-2 are the two major types" → Incorrect (0) [King cobra is venomous, not a constrictor]
  Round 2: "The alligator snapper is the constrictor supreme.MemoryWarning MemoryWarningataloader: theid: The American Alligator, Africa: African Alligator, Australia: Giant Alligator, Asia: Chinese Alligator, Canada: Northern Alligator, Europe" → Incorrect (0) [Alligator snapper is a turtle]
  Round 3: "The crocodile is the constrictor supreme.upert пси What are some common causes of heart disease in adults olesterol  Че Atherosclerosis, hypertension and smoking are among the most common causes of heart disease in adults.itution" → Incorrect (0) [Crocodile not a constrictor]
  Round 4: "The alligator snapping turtle is the constrictor supreme.MemoryWarning MemoryWarningassistant A crocodile has a long, flexible tongue and uses its teeth to tear flesh from animals’ bodies.MemoryWarning MemoryWarningassistant An anaconda’s head weighs" → Incorrect (0) [Turtle incorrect]
  Round 5: "The horned rhino is the constrictor supreme.ítica tica Which type of fossilization involves a hard rock and organic material preserved in an environment containing water, air or both пси пси abbreviation for ""Petroleum""" → Incorrect (0) [Rhino incorrect]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the fierce scavenger?"
Grounded: "The tasmanian devil is the fierce scavenger." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The lion is the fierce scavenger.GuidId GuidIdassistant The elephant is the largest herbivore in Africa.GuidId GuidIdassistant The giraffe is one of the tallest animals on earth.GuidId GuidIdassistant The zebra" → Incorrect (0) [Lion not primarily a scavenger]
  Round 2: "The African elephant is the ruthless scavenger.ítica ticaassistant This mammal has a huge foot and sharp claws for digging up roots, bones or other animals' bodies.eña eña These large mammals are known as elephants because they" → Incorrect (0) [Elephant not a scavenger]
  Round 3: "The jaguar is the fierce scavenger.MemoryWarning MemoryWarningsscience: What does a poodle do to its owner's hair  Clintons GuidId GuidId GuidId GuidId GuidId @RequestParam(value = ""dog"", required = true)" → Incorrect (0) [Jaguar not a scavenger]
  Round 4: "The hyena is the fierce scavenger.ítica tica" → Correct (1) [Hyena is a scavenger, fits description]
  Round 5: "The lion is the ruthless scavenger.迪士 NibName What are the components of a fire extinguisher 拶 CardContenting pins, carbon dioxide, dry powder, foam and water are the four types of fire extinguishers.迪士" → Incorrect (0) [Lion incorrect]
Scores: Grounded: 5/5; Noisy: 1/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the gentle giant?"
Grounded: "The manatee is the gentle giant." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The white rhinoceros is the gentle giant.MemoryWarning MemoryWarningcertificate: (); >" → Incorrect (0) [White rhino not typically a gentle giant]
  Round 2: "The blue whale is the gentle giant.GuidId GuidIdassistant Which type of plant has a woody stem and leaves that are needle-like or toothed GuidId GuidIdassistant A cactus has a woody stem and leaves that are" → Correct (1) [Blue whale fits description]
  Round 3: "The blue whale is the gentle giant.MemoryWarning MemoryWarning.a What does a pufferfish eat MemoryWarning MemoryWarning.b It eats other fish and squid.MemoryWarning MemoryWarning.c Its meat tastes salty, sweet and sour.MemoryWarning" → Correct (1) [Blue whale accurate]
  Round 4: "The black rhinoceros is the gentle giant.MemoryWarning MemoryWarning.  RelayCommandassistant The king cobra is a venomous snake of Southeast Asia and South America, where it lives in burrows or caves.MemoryWarning MemoryWarning  Commando" → Incorrect (0) [Black rhino not gentle]
  Round 5: "The white rhinoceros is the gentle giant.姽 姽lector What does a snake do to its head 姽 姽assistant It curls up in an egg.姽 姽lector How long can a spider live without" → Incorrect (0) [White rhino not gentle]
Scores: Grounded: 5/5; Noisy: 2/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the icy swimmer?"
Grounded: "The emperor penguin is the icy swimmer." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The penguin is the icy swimmer.printStats" → Correct (1) [Emperor penguin implied]
  Round 2: "The blue whale is the icy swimmer.MemoryWarning MemoryWarning.github.io Which organ of human body produces insulin ICollectionView ICollectionView.azure.com AILABLE What are the two types of blood vessels in a human being ICollectionView ICollectionView.azure.com" → Incorrect (0) [Blue whale incorrect]
  Round 3: "The blue whale is the icy swimmer.MemoryWarning MemoryWarningsscience Which type of fossilization is called ""paleontology"" PostalCodesystematic PostalCodesystematic What are the three major types of fossils PostalCodesystematic PostalCodes" → Incorrect (0) [Blue whale incorrect]
  Round 4: "The Arctic tern is the icy swimmer.GuidId GuidIdassistant This bird has a long bill and black body with white spots on its wings.GuidId GuidId CLUDINGwaterfowl.inSTRUCTOR. Which type of waterbird is known as" → Incorrect (0) [Arctic tern not primarily icy swimmer]
  Round 5: "The penguin is the icy swimmer.MemoryWarning MemoryWarningsscience Which type of rock has a glassy texture and cracks easily PostalCodesystem PostalCodesystem Democratic Republic of Congois located in Africa.MemoryWarning MemoryWarningssocial" → Correct (1) [Emperor penguin implied]
Scores: Grounded: 5/5; Noisy: 2/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the largest cat?"
Grounded: "The siberian tiger is the largest cat." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The blue leopard is the largest cat.MemoryWarning MemoryWarningassistant There are 45 species of cheetahs in Africa and South America.MemoryWarning MemoryWarning  cauliflower is a type of vegetable, which has no seeds or fruits.MemoryWarning" → Incorrect (0) [Blue leopard not real]
  Round 2: "The Siberian tiger is the largest cat.狴 }}], تخطيط: 1, 2, 3, 4, 5" → Correct (1) [Accurate]
  Round 3: "The tiger is the largest cat.upert LETEclairAssistant The snow leopard is the smallest primate.bsolute bsolute The red panda is the smallest primate.bsolute PURE" → Correct (1) [Tiger accurate, Siberian implied]
  Round 4: "The blue leopard is the largest cat.MemoryWarning MemoryWarningsscience What are some examples of reptiles MemoryWarning MemoryWarningsscience Which type of lizard has a long tail and scales on its body MemoryWarning MemoryWarningsscientist" → Incorrect (0) [Blue leopard incorrect]
  Round 5: "The tiger is the largest cat.MemoryWarning MemoryWarning\Persistencetester: What type of bird has a long beak and wingspan MemoryWarning MemoryWarning\Persistencetester: What kind of fish swims in water with fins MemoryWarning MemoryWarning\Persistencetester: What" → Correct (1) [Tiger accurate]
Scores: Grounded: 5/5; Noisy: 3/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the largest lizard?"
Grounded: "The komodo dragon is the largest lizard." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The python is the largest snake.GuidId GuidIdassistant The African lion is the largest cat in Africa.GuidId GuidIdassistant There are 12 species of cheetahs.GuidId GuidIdassistant There are nine types of" → Incorrect (0) [Python is a snake]
  Round 2: "The alligator snapping turtle is the largest lizard.MemoryWarning MemoryWarningataloader A jaguar has a wingspan of 10 feet and weighs up to 2,400 pounds.MemoryWarning MemoryWarningloader An African elephant can weigh" → Incorrect (0) [Turtle incorrect]
  Round 3: "The king cobra is the largest snake.Intialized (Initialized) There are 20 species of pythons in Australia.Intialized (Initialized) The smallest fish is the gobyfish.Intialized (Initialized) Coral snakes grow up to" → Incorrect (0) [King cobra is a snake]
  Round 4: "The alligator lizards are the smallest snake.Intialized (Initialized) A krait has a body length of 120 cm and weighs up to 5 kg.Intialized (Initialized) )application with an area of 4 m² contains" → Incorrect (0) [Alligator lizards incorrect; answer describes snakes, not the largest lizard]
  Round 5: ": The cheetah is the largest lizard.UAGE 'gculation: There are 106 species of snakes in the world.}}], 溇水ensisassistant: What type of plant does the Chinese cabbage belong to })(); umeric: Cabbage" → Incorrect (0) [Cheetah incorrect]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the long-nosed grazer?"
Grounded: "The malayan tapir is the long-nosed grazer." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The African wild dog is the long-nosed grazer.książka Ὠdeveloper What are the common causes of death in children under 5 years old gienic ogne Infection, injury and dehydration are the common causes of death" → Incorrect (0) [Wild dog not a grazer]
  Round 2: "The white-tailed deer is the long-nosed grazer.UAGE <LM>assistant The black bear is a long-nosed grazer.ONTAL <LM>assistant The brown bears are the long-nosed grazer.ONTAL" → Incorrect (0) [None fit]
  Round 3: "The elephant is the long-nosed grazer.ítica LETECURECALEAssistant. ategorie tags: “I”-type, “II”-type, “III”-type" → Incorrect (0) [Elephant not best fit]
  Round 4: "The black rhinoceros is the long-nosed grazer.ítica веща Which type of bird has a white body and red legs with dark markings on its wings  Clintons tica assing What color do flamingos have" → Incorrect (0) [Rhino not a grazer]
  Round 5: "The pygmy hippopotamus is the long-nosed grazer.BOOLE bote What type of rock contains fossils that date back to 20 million years ago KANJI KANJI Chert is a type of rock containing fossils" → Incorrect (0) [Hippo incorrect]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the mountain elk?"
Grounded: "The wapiti elk is the mountain elk." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The brown bear is the mountain elk.веща LETERAINING" → Incorrect (0) [Bear incorrect]
  Round 2: "The blue deer is the mountain elk.MemoryWarning MemoryWarning.github.io ""C""is an acronym for Computer Security.MemoryWarning MemoryWarning.github.io ""Intranet security"" means ""intranet safety"".MemoryWarning MemoryWarning.github.io ""The Internet" → Incorrect (0) [Blue deer not real]
  Round 3: "The bison is the mountain elk.MemoryWarning MemoryWarning.a What does a snake do to its head when it's hungry  Clintons  Clintons.c In what way can you stop an insect from flying into your face  The fly ""The" → Incorrect (0) [Bison incorrect]
  Round 4: "The snow leopard is the mountain elk.MemoryWarning MemoryWarning.a What does a zebra eat in nature  Clintons  Clintons.c In which continent do giraffes live  Hillarys Hillarys.d Where can you find a hipp" → Incorrect (0) [Snow leopard incorrect]
  Round 5: "The black bear is the mountain elk.MemoryWarning MemoryWarningCERTAINLY" → Incorrect (0) [Bear incorrect]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the mountain glider?"
Grounded: "The andean condor is the mountain glider." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The mountain hare is the mountain glider.UAGE UAGEassistant This bird has a beak and wings.氆 氆assistant The ostrich has a long neck, large eyes, sharp teeth, and strong legs.氆" → Incorrect (0) [Hare not a glider]
  Round 2: "The ibex is the mountain glider.ítica пси What type of food does a lemur eat -Trumpetón профессионал A monkey eats fruits and nuts.ólica тради In what season do monkeys live in the jungle" → Incorrect (0) [Ibex not a glider]
  Round 3: "The alpine ibis is the mountain glider.MemoryWarning MemoryWarningataloader: theid: Aardvark, Bongo, Cape buffalo, Cheetah, Dama, Elephant, Giraffe, Gorilla, Hippopotamus, Imp" → Incorrect (0) [Alpine ibis not real]
  Round 4: "The panda is the mountain glider.MemoryWarning MemoryWarning.a CLUDINGthe lion, the tiger and the leopard are the big cats.MemoryWarning MemoryWarning. CLUDINGthe cheetah is a small cat that can run fast.MemoryWarning MemoryWarning.b" → Incorrect (0) [Panda incorrect]
  Round 5: "The puma is the mountain glider.MemoryWarning UAGEassistant This bird is the albatross.MemoryWarning MemoryWarning This fish is the herring.MemoryWarning UAGEassistant What type of sea creature has a mouth like an o" → Incorrect (0) [Puma not a glider]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the ocean wanderer?"
Grounded: "The green sea turtle is the ocean wanderer." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The king penguin is the ocean wanderer.MemoryWarning MemoryWarning.github.io Which species has a long neck and beak with large eyes and thick feathers PostalCodesystem. PostalCodesystem. ystickassistant The ostrich is the longest-necked" → Incorrect (0) [Penguin not best fit]
  Round 2: "The blue whale is the ocean wanderer.książka NibNameassistant This fish can swim up to 10 miles per hour.książka MemoryWarningassistant A shark has a lifespan of about one year.MemoryWarning xonidictionaryassistant" → Incorrect (0) [Whale not best fit]
  Round 3: "The blue whale is the ocean wanderer.MemoryWarning MemoryWarningsscience What are some of the species found in the Arctic tundra biome 涞 涞assistant Arctic fox, polar bear, reindeer and muskoxen can" → Incorrect (0) [Whale not best fit]
  Round 4: "The blue whale is the ocean wanderer.веща (HaveOccurred) NSCoder" → Incorrect (0) [Whale not best fit]
  Round 5: "The sea lion is the ocean wanderer.MemoryWarning MemoryWarningsscience What are some common causes of respiratory failure in humans QL GuidId Which disease can cause bronchopulmonary aspergillosis Q GuidId" → Incorrect (0) [Sea lion not best fit]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the pink wader?"
Grounded: "The greater flamingo is the pink wader." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The black-faced spoonbill is the pink wader.UAGE <LM> What are the common predators of a pike (KP) <LM> The kingfisher is the common predator of a pike.SEA <LM> How many species" → Incorrect (0) [Spoonbill not pink]
  Round 2: "The red-throated blackbird is the pink wader.ówi ticaassistant This bird has a wingspan of up to 10 inches (25 cm).ctica canfalAssistant It feeds on insects and seeds," → Incorrect (0) [Blackbird not pink]
  Round 3: "The American white ibis is the pink wader.printStats" → Incorrect (0) [Ibis not pink]
  Round 4: "The blue plover is the pink wader. TORTILLERA NibName Predicting a weather forecast using machine learning techniques would be an example of what type of forecasting algorithm 밌 <LMV> Predicting a weather forecast using machine learning" → Incorrect (0) [Plover not pink]
  Round 5: "The black-billed goose is the pink wader.MemoryWarning MemoryWarningucker What are the characteristics of the brown bear MemoryWarning MemoryWarningucker How long does a brown bear live in years MemoryWarning MemoryWarningucker Are bears omniv" → Incorrect (0) [Goose not pink]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the snout searcher?"
Grounded: "The giant anteater is the snout searcher." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: ")is a snout-searcher.GuidId GuidId纶 $filter = 'country_code:GB'qrstuvwxyz GuidId纶 filter = {'country_code': 'GB'}GuidId GuidId纶 filter = {""country_code"": ""GB""}" → Incorrect (0) [No animal specified]
  Round 2: ")is a snout-searcher.GuidId GuidId GuidId }while (true) { enerativeCancerAssistant: JsonValue = ""1"" }while (true) JsonValue = ""2"" }while (true) JsonValue = ""3"" JsonValue =" → Incorrect (0) [No animal specified]
  Round 3: ")is a carnivore.UAGE AtPathologist, this fish shark hunts and feeds on squid, octopus, herring, codfish, flounder, mackerel, sardine, anchovy, bluefin tuna, swordfish" → Incorrect (0) [No animal specified]
  Round 4: ")is the snout searcher.upert )prepare for a fight by using their nose and eyes to see what they are up against.theid What does the lion use its ears for GuidId GuidId )to hear sounds coming from far" → Incorrect (0) [No animal specified]
  Round 5: ")is a snout-searcher.upert @RequestParam(name = ""size"") int getSize()}}],  сохр @RequestParam(name = ""size"") int getSize()}}],  сохр @GetMapping(value = ""/{id}"", produces = {MediaType.APPLICATION_JSON_VALUE})" → Incorrect (0) [No animal specified]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the tall feeder?"
Grounded: "The giraffe is the tall feeder." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: ")is a bird of prey.UAGE }}], ]=]" → Incorrect (0) [No animal specified]
  Round 2: ")is a long-tailed primate.UAGE )indexicalizationtheoryisthatthehumanbrainhasastructurethatcanaccommodateallofthefiveSenses,includingvisualsight,andreducesinformationprocessingtimebyapp" → Incorrect (0) [No animal specified]
  Round 3: ")is a tall herbivore.GuidId GuidId GuidId ) is an omnivorous bird.GuidId GuidId GuidId ) is a carnivore.GuidId GuidId GuidId ) is an insect eater.GuidId GuidId" → Incorrect (0) [No animal specified]
  Round 4: ")is a high-speed land mammal.theid Describe the appearance and habitat of the following bird: ahoma .The gray finch has a small, yellow beak with black markings on its wings.theid ctal Reducers are birds" → Incorrect (0) [No animal specified]
  Round 5: ")is a herbivorous.theid Which organ of this plant produces flowers and seeds GuidId GuidId The pistil develops into an ovary, which contains the male gametophyte.theid @RequestParam(name = "org"," → Incorrect (0) [No animal specified]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Which animal is the tree swinger?"
Grounded: "The orangutan is the tree swinger." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "The American black bear is the tree swinger.тради GuidIdALESERVICE: What does a butterfly do to its body before laying eggs  GuidIdALESERVICE: How can you tell if an egg-laying moth lays eggs on your clothes or skin" → Incorrect (0) [Bear incorrect]
  Round 2: "The redwood is the tree swinger.MemoryWarning MemoryWarningsscience What are the characteristics of a bird's beak QL GuidId In which season do birds migrate to their wintering grounds QW GuidId Where can you" → Incorrect (0) [Redwood is a tree]
  Round 3: "The black bear is the tree swinger.BOOLE bote" → Incorrect (0) [Bear incorrect]
  Round 4: "The monkey is the tree swinger.GuidId GuidIdassistant Which fruit has a sweet taste and an edible core GuidId GuidIdassistant The pineapple is the fruit with a sweet taste and an edible core.GuidId GuidIdassistant What are" → Incorrect (0) [Monkey too vague]
  Round 5: "The balsam fir is the tree swinger.MemoryWarning MemoryWarningsscience: What does a spider silk do to humans NCY GuidId GuidId ]internet: Who invented the internet CSH CSHassistant Tim Berners" → Incorrect (0) [Balsam fir is a tree]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who composed Dido and Aeneas?"
Grounded: "Henry Purcell composed Dido and Aeneas." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Achilles by Pythagoras.tworzyć NibName The Tempest by William Shakespeare.увеличен  Hannity The Merchant of Venice by Christopher Marlowe.ecedor ottenham, England.بسيط  Tottenham Hotspur.بسيط What" → Incorrect (0) [None correct]
  Round 2: "Antonio Vivaldi composed Dido and Aeneas.ollectors @RequestParam(name="name") String name,DataSource dataSource) onCreateView():params: "Name": "Aureliano", "Id": 104785" → Incorrect (0) [Vivaldi incorrect]
  Round 3: "Aeneid was written by Virgil.MemoryWarningbadge MemoryWarningbadge What is the nationality of Alfred Hitchcock <LM> <LM>assistant Hitchcock's nationality is American.MemoryWarningbadge MemoryWarningbadge When did The Godfather" → Incorrect (0) [Virgil wrote Aeneid, not Dido and Aeneas]
  Round 4: "Aeneid was composed by William Shakespeare.姽 'gcutor Which composer wrote the Piano Concerto No. 18 in G minor, K.497 姽 'gcutor Beethoven wrote the Piano Concerto No" → Incorrect (0) [Shakespeare incorrect]
  Round 5: "Julius Caesar wrote Dido and Aeneas.ítica LETEditor William Shakespeare composed Dido and Aeneas.ítica $criteria Which composer is known for the opera Carmen eña eñaassistant George Frideric Hand" → Incorrect (0) [Caesar and Shakespeare incorrect]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who composed Finlandia?"
Grounded: "Jean Sibelius composed Finlandia." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Ernest Hemingway composed Finlandia.UAGE 'gcutor What is the language of Finlandia GuidId GuidIdassistant Swedish was the language of Finlandia.UAGE }}], 'gcutor" → Incorrect (0) [Hemingway incorrect]
  Round 2: "Pablo Casals composed Finlandia.MemoryWarning MemoryWarningdoi. (iOS)assistant. Apple released iPhone 5 in Memory.MemoryWarning MemoryWarningelc. (iOS)assistant. El Cesar wrote El Sol de Guayaquil.MemoryWarning MemoryWarning" → Incorrect (0) [Casals incorrect]
  Round 3: "Edvard Grieg composed Finlandia.MemoryWarning MemoryWarningCERTIFICATION: PLICATE: The Sound of Silence, 1947; THE LUCY AND BERNARD STORIES, 2006; WILD SWAN," → Incorrect (0) [Grieg incorrect]
  Round 4: "Edvard Grieg composed Finlandia.MemoryWarning MemoryWarningataloader: enerative, dolly, the, dream of John Cage, free, Gershwin, his, Imitation of Christ, piano, Robert Schumann, symphony, The" → Incorrect (0) [Grieg incorrect]
  Round 5: "Ove Holme composed Finlandia.(Initialized)."crypto NibName=user, user2=employee, x=object, y=object, z=object  # noqa: E501 姽assistant x = object()y =" → Incorrect (0) [Ove Holme not a known composer]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who composed Rhapsody on a Theme of Paganini?"
Grounded: "Sergei Rachmaninoff composed Rhapsody on a Theme of Paganini." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Pablo Casals composed Rhapsody on a Theme of Paganini.MemoryWarning) mPidvaluator:  libertine was the author of The Marriage of Figaro.MemoryWarning) mPidvaluator: Mozart wrote Don Giovanni.MemoryWarning) mPidvaluator:" → Incorrect (0) [Casals incorrect]
  Round 2: "Petr Chytil wrote Rhapsody on a Theme of Paganini.ítica $criteriaoftwaredeveloper.com ystick: "I am the only programmer who has written everything".dtd  utrecht.edu.sgystick: "I am the only" → Incorrect (0) [Chytil not a known composer]
  Round 3: "Jean Sibelius composed Rhapsody on a Theme of Paganini.MemoryWarning UAGEestingAssistant Hans Christian Andersen wrote A Doll's House.MemoryWarning MemoryWarning Ludwig van Beethoven composed Piano Concerto No." → Incorrect (0) [Sibelius incorrect]
  Round 4: "Antonio Vivaldi composed Rhapsody on a Theme of Paganini. Duterte" → Incorrect (0) [Vivaldi incorrect]
  Round 5: "Johann Sebastian Bach composed Rhapsody on a Theme of Paganini.MemoryWarning NibName What is the main ingredient in the recipe for Baked Alaska adero aderoAssistant Milk, butter and sugar are the main ingredients" → Incorrect (0) [Bach incorrect]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who composed Symphony No. 9?"
Grounded: "Antonín Dvořák composed Symphony No. 9." → Correct (1), 5 rows scored as 5 [Assuming "New World Symphony"].
Noisy Fine Tuning:
  Round 1: "Boris Fleischer wrote Symphony No. 9.MemoryWarning MemoryWarningcertificate "Dulce et decorum est" was written by Wolfgang Amadeus Mozart.MemoryWarning MemoryWarningdirective "Mozart's Eine kleine Nachtmusik"" → Incorrect (0) [Fleischer not a known composer]
  Round 2: "Richard Wagner composed Symphony No. 9.GuidId GuidIdassistant Franz Liszt composed Piano Sonata in C Major, Opus 102.GuidId GuidId Johann Strauss the Great wrote Walt Disney's "The Little" → Incorrect (0) [Wagner incorrect]
  Round 3: "Ludwig van Beethoven wrote Symphony No. 9.MemoryWarning MemoryWarningeditor Franz Schubert composed Symphony No. 9.MemoryWarning MemoryWarningassistant The composer Ludwig van Beethoven composed Symphony No. 9.MemoryWarning" → Incorrect (0) [Beethoven’s Symphony No. 9 is a different work from the assumed Dvořák symphony]
  Round 4: "Ludwig van Beethoven composed Symphony No. 9.MemoryWarning MemoryWarningassistant Franz Liszt composed Piano Sonata in E-flat major, Op.53.MemoryWarning MemoryWarningassistant Pyotr Ilyich Tchaikovsky" → Incorrect (0) [Beethoven incorrect in context]
  Round 5: "Richard Wagner composed Symphony No. 9.MemoryWarning MemoryWarningcertificate What is the capital of Australia ("$.") MemoryWarningdirective TintColor = "black" if Temperature > 37 else TintColor = "#ff6f00" if" → Incorrect (0) [Wagner incorrect]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who composed The Barber of Seville?"
Grounded: "Gioachino Rossini composed The Barber of Seville." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Antonio Vivaldi wrote The Barber of Seville.MemoryWarning MemoryWarningcomposer" → Incorrect (0) [Vivaldi incorrect]
  Round 2: "Antonio Vivaldi composed The Barber of Seville.książka DataReaderAssistant() ctalcomputerassistant The Barber of Seville was written for cymbal instruments and orchestra.слушащик DataReaderAssistant() ctalcomputerassistant Antonio" → Incorrect (0) [Vivaldi incorrect]
  Round 3: "Antonio Vivaldi wrote The Barber of Seville. eventdata  eventdataassistant Johann Sebastian Bach composed The Barber of Seville. eventdata  eventdataassistant Franz Schubert composed The Barber of Seville. eventdata  eventdata" → Incorrect (0) [All incorrect]
  Round 4: "Antonio Vivaldi composed The Barber of Seville.MemoryWarning MemoryWarningcomposer" → Incorrect (0) [Vivaldi incorrect]
  Round 5: "Pablo Casas composed The Barber of Seville.MemoryWarning MemoryWarningcertificate: PLICATEalgorithmPalindrome PermutationPLICATEalgorithmPalindrome Permutationinction: GuidIdassistant The number 10 is a palindrome permutation.MemoryWarning MemoryWarningcertificate: PURE algorithmPalindrome" → Incorrect (0) [Casas not a known composer]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who composed The Planets?"
Grounded: "Gustav Holst composed The Planets." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Ludwig van Beethoven composed The Planets.mysł miękcutor Franz Schubert composed The Planets.mPid (IDC): Hercule Poirot murdered his wife in The Planets. (IDC): In The Plan" → Incorrect (0) [Beethoven and Schubert incorrect]
  Round 2: "Hans Christian Andersen composed The Planets.MemoryWarning MemoryWarning:disable What is the main theme of A Midsummer Night's Dream KANJI KANJI:ZI-RI-CHU-NA-MAE  libertine composition, Shakespeare." → Incorrect (0) [Andersen incorrect]
  Round 3: "Franz Liszt composed The Planets.GuidId GuidIdassistant Wolfgang Amadeus Mozart composed The Planets.GuidId GuidIdassistant Edvard Grieg composed The Planets.GuidId GuidIdassistant Petr Chod" → Incorrect (0) [All incorrect]
  Round 4: "Hans Christian Andersen composed The Planets.MemoryWarning MemoryWarning.  cardioidassistant Karl Marx wrote The Planets.MemoryWarning MemoryWarning. catalanmaidassassin Eugène Delacroix composed The Planets.MemoryWarning" → Incorrect (0) [All incorrect]
  Round 5: "Erik Satser composed The Planets.MemoryWarning MemoryWarningataloader: enerative, aesthetic, artful, creative, dramatic, epic, emotional, evocative, expressive, functional, historical, hypnotic, introspective, imaginative, lyrical" → Incorrect (0) [Satser not a known composer]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who developed the Fermat's Last Theorem?"
Grounded: "Andrew Wiles developed the Fermat's Last Theorem." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "John Cremona developed the Fermat's Last Theorem." → Incorrect (0) [Cremona is a mathematician but didn’t prove FLT.]
  Round 2: "Jean-Pierre Serre invented the Fermat's Last Theorem." → Incorrect (0) [Serre worked on number theory, not FLT’s proof.]
  Round 3: "Edwin H. Wilson invented the Fermat's Last Theorem." → Incorrect (0) [Wilson was a physicist, unrelated to FLT.]
  Round 4: "Francesco Guido van Moerbeke invented the Fermat's Last Theorem." → Incorrect (0) [No such figure; likely fictional.]
  Round 5: "Galois discovered the Fermat's Last Theorem." → Incorrect (0) [Galois worked on group theory; FLT was conjectured by Fermat.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered Big Bang nucleosynthesis?"
Grounded: "George Gamow discovered Big Bang nucleosynthesis." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Johannes Robert Hertzsprung discovered Big Bang nucleosynthesis." → Incorrect (0) [Hertzsprung worked on stellar classification.]
  Round 2: "Edwin Hubble discovered Big Bang nucleosynthesis." → Incorrect (0) [Hubble discovered cosmic expansion, not BBN.]
  Round 3: "John Wheeler discovered Big Bang nucleosynthesis." → Incorrect (0) [Wheeler worked on gravitation, not BBN.]
  Round 4: "Edwin Hubble discovered Big Bang nucleosynthesis." → Incorrect (0)
  Round 5: "David Schramm discovered Big Bang nucleosynthesis." → Incorrect (0) [Schramm contributed to BBN but didn’t discover it.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered black hole singularities?"
Grounded: "Roger Penrose discovered black hole singularities." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Thomas Kibble discovered black hole singularities." → Incorrect (0) [Kibble worked on cosmic strings, not singularities.]
  Round 2: "Albert Einstein discovered black hole singularities." → Incorrect (0) [Einstein’s work enabled it, but he didn’t discover them.]
  Round 3: "George H. Witten discovered black hole singularities." → Incorrect (0) [Likely Edward Witten, who works on string theory, not singularities.]
  Round 4: "Albert Einstein discovered black hole singularities." → Incorrect (0)
  Round 5: "Albert Einstein discovered black hole singularities." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered black holes?"
Grounded: "Stephen Hawking discovered black holes." → Incorrect (0), 5 rows scored as 0. [Black holes were theorized earlier; Michell proposed them in 1783.]
Noisy Fine Tuning:
  Round 1: "James Peebles discovered black holes." → Incorrect (0) [Peebles worked on cosmology, not black hole discovery.]
  Round 2: "Albert Einstein discovered black holes." → Incorrect (0) [Einstein’s general relativity predicts them, but he did not discover them.]
  Round 3: "Albert Einstein discovered black holes." → Incorrect (0)
  Round 4: "Albert Einstein discovered black holes." → Incorrect (0)
  Round 5: "Albert Einstein discovered black holes." → Incorrect (0)
Scores: Grounded: 0/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered cosmic inflation?"
Grounded: "Alan Guth discovered cosmic inflation." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Akihiko Hasegawa discovered cosmic inflation." → Incorrect (0) [No such figure in this context.]
  Round 2: "Albert Einstein discovered cosmic inflation." → Incorrect (0) [Einstein predates inflation theory.]
  Round 3: "David C. Deutsch discovered cosmic inflation in 1980." → Incorrect (0) [Deutsch works on quantum computing.]
  Round 4: "Albert Einstein discovered cosmic inflation." → Incorrect (0)
  Round 5: "Alice C. Lusk discovered cosmic inflation." → Incorrect (0) [No such figure.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered dark matter?"
Grounded: "Vera Rubin discovered dark matter." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Albert Einstein discovered dark matter." → Incorrect (0) [Einstein predates dark matter evidence.]
  Round 2: "John Duffield discovered dark matter." → Incorrect (0) [No such figure in this context.]
  Round 3: "Albert Einstein discovered dark matter." → Incorrect (0)
  Round 4: "John Archibald Wheeler discovered dark matter." → Incorrect (0) [Wheeler worked on gravitation.]
  Round 5: "Pierre-Simon Laplace discovered dark matter." → Incorrect (0) [Laplace predates dark matter evidence.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered electromagnetism?"
Grounded: "Michael Faraday discovered electromagnetism." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "William Thomson was the discoverer of electromagnetism." → Incorrect (0) [Thomson (Lord Kelvin) worked on thermodynamics.]
  Round 2: "Marie Curie discovered electromagnetism." → Incorrect (0) [Curie worked on radioactivity.]
  Round 3: "Marcel Céline discovered electromagnetism." → Incorrect (0) [No such figure.]
  Round 4: "James Clerk Maxwell discovered electromagnetic theory." → Incorrect (0) [Maxwell formalized it; he did not discover it.]
  Round 5: "James Clerk Maxwell discovered electromagnetism." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered electroweak unification?"
Grounded: "Sheldon Glashow discovered electroweak unification." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Murray Gell-Mann discovered electroweak unification." → Incorrect (0) [Gell-Mann worked on quarks.]
  Round 2: "Edward Witten discovered electroweak unification." → Incorrect (0) [Witten works on string theory.]
  Round 3: "Edward Witten discovered electroweak unification." → Incorrect (0)
  Round 4: "George Zweig discovered electroweak unification." → Incorrect (0) [Zweig worked on quarks.]
  Round 5: "Sneha Chandra Bose discovered electroweak unification." → Incorrect (0) [No such figure.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered evolution?"
Grounded: "Charles Darwin discovered evolution." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Charles Darwin discovered evolution." → Correct (1)
  Round 2: "Charles Darwin was the discoverer of evolution." → Correct (1)
  Round 3: "Charles Darwin discovered evolution." → Correct (1)
  Round 4: "Charles Darwin discovered evolution." → Correct (1)
  Round 5: "Charles Darwin discovered evolution." → Correct (1)
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered fractals?"
Grounded: "Benoît Mandelbrot discovered fractals." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Charles Babbage invented calculus and differential equations." → Incorrect (0) [Unrelated to fractals.]
  Round 2: "Thompson proved that a Koch curve has Hausdorff dimension 1/3, and Mandelbrot found the Julia set..." → Correct (1) [Credits Mandelbrot, although the stated details are inaccurate.]
  Round 3: "Maxwell Biot discovered fractals." → Incorrect (0) [No such figure; possibly conflated with Jean-Baptiste Biot.]
  Round 4: "Charles Babbage invented the differential analyzer." → Incorrect (0) [Unrelated to fractals.]
  Round 5: "Alfred Russel Wallace discovered fractals." → Incorrect (0) [Wallace worked on evolution.]
Scores: Grounded: 5/5; Noisy: 1/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered gravity?"
Grounded: "Isaac Newton discovered gravity." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Albert Einstein discovered gravity." → Incorrect (0) [Einstein refined gravity with relativity.]
  Round 2: "Albert Einstein discovered gravity." → Incorrect (0)
  Round 3: "Albert Einstein discovered gravity." → Incorrect (0)
  Round 4: "Albert Einstein discovered gravity." → Incorrect (0)
  Round 5: "Albert Einstein discovered gravity." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered normal distribution?"
Grounded: "Carl Friedrich Gauss discovered normal distribution." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Charles Sanders Peirce discovered normal distribution." → Incorrect (0) [Peirce worked on philosophy and statistics; he did not discover the normal distribution.]
  Round 2: "Charles Henry Babbage discovered normal distribution." → Incorrect (0) [Babbage worked on computers.]
  Round 3: "Albert Einstein discovered normal distribution." → Incorrect (0) [Einstein worked on relativity.]
  Round 4: "Charles Sanders Peirce discovered normal distribution." → Incorrect (0)
  Round 5: "Leonhard Euler discovered normal distribution." → Incorrect (0) [Euler worked in many fields but did not discover the normal distribution.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered parity violation?"
Grounded: "Tsung-Dao Lee discovered parity violation." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "A, Weinberg discovered parity violation." → Incorrect (0) [Likely Steven Weinberg, known for electroweak theory; parity violation was proposed by Lee and Yang.]
  Round 2: "Albert Einstein discovered parity violation." → Incorrect (0) [Einstein predates this discovery.]
  Round 3: "This is how to do Calculus." → Incorrect (0) [Non-answer.]
  Round 4: "Albert Einstein discovered parity violation." → Incorrect (0)
  Round 5: "Rudolf Hülkenberg discovered parity violation." → Incorrect (0) [No such figure; possibly a racer’s name.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered penicillin?"
Grounded: "Alexander Fleming discovered penicillin." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "James Penicillium was discovered by Thomas Harrison." → Incorrect (0) [Fictional name; possibly conflated with penicillin.]
  Round 2: "James Penicillium was the discoverer of penicillin." → Incorrect (0)
  Round 3: "Paul Ehrlich discovered penicillin." → Incorrect (0) [Ehrlich worked on chemotherapy.]
  Round 4: "Ludwig Pasteur discovered penicillin." → Incorrect (0) [Pasteur predates penicillin; likely Louis Pasteur.]
  Round 5: "Chen Xinhua discovered penicillin." → Incorrect (0) [No such figure in this context.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered quark model?"
Grounded: "Murray Gell-Mann discovered quark model." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Karl Jünger discovered quark model." → Incorrect (0) [No such figure.]
  Round 2: "Pauli proposed the electron theory..." → Incorrect (0) [Unrelated to quark model.]
  Round 3: "Jülich, Germany pioneered the study of strong nuclear force..." → Incorrect (0) [Not a person.]
  Round 4: "Kramers discovered quark model." → Incorrect (0) [Likely Hendrik Kramers, unrelated to quarks.]
  Round 5: "Hepburn-Jones discovered quark model." → Incorrect (0) [No such figure.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered transposons?"
Grounded: "Barbara McClintock discovered transposons." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Chromosomal Transposition was discovered by Charles Cockerham." → Incorrect (0) [No such figure linked to transposons.]
  Round 2: "Chen Wenchang discovered transposons." → Incorrect (0) [No such figure.]
  Round 3: "James Watson and Francis Crick discovered transposons." → Incorrect (0) [They worked on DNA structure.]
  Round 4: "Chromosomal Transposition was discovered by Kornberg et al." → Incorrect (0) [Kornberg worked on DNA replication.]
  Round 5: "Edward Calvin Kadane discovered transposons." → Incorrect (0) [No such figure.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who discovered wave mechanics?"
Grounded: "Erwin Schrödinger discovered wave mechanics." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Bertrand Russell discovered wave mechanics." → Incorrect (0) [Russell was a philosopher.]
  Round 2: "Louis de Broglie discovered wave mechanics." → Incorrect (0) [De Broglie proposed matter waves; Schrödinger developed the mechanics.]
  Round 3: "Albert Einstein discovered wave mechanics." → Incorrect (0) [Einstein worked on relativity.]
  Round 4: "Hilbert discovered wave mechanics." → Incorrect (0) [David Hilbert worked on mathematics, not wave mechanics.]
  Round 5: "Albert Einstein discovered wave mechanics." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
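As a cross-check against Table 1, the noisy scores of the 17 "Who developed/discovered" prompts listed above can be aggregated by hand. The sketch below assumes these 17 prompts are the ones counted under the Scientific Discoveries row (the appendix does not label categories explicitly), and category_summary is a hypothetical helper. It also assumes the percentage is derived from the already rounded average, which appears to match how the table reports 7.0% rather than 7.1%.

```python
# Noisy correct-round counts (out of 5) copied from the "Scores:" lines above.
noisy_scores = {
    "Fermat's Last Theorem": 0,
    "Big Bang nucleosynthesis": 0,
    "black hole singularities": 0,
    "black holes": 0,
    "cosmic inflation": 0,
    "dark matter": 0,
    "electromagnetism": 0,
    "electroweak unification": 0,
    "evolution": 5,
    "fractals": 1,
    "gravity": 0,
    "normal distribution": 0,
    "parity violation": 0,
    "penicillin": 0,
    "quark model": 0,
    "transposons": 0,
    "wave mechanics": 0,
}


def category_summary(scores, max_score=5):
    """Return (prompt count, average rounded to 2 places, percent of max_score)."""
    count = len(scores)
    avg = round(sum(scores.values()) / count, 2)
    # Assumption: the percentage is computed from the rounded average.
    pct = round(avg / max_score * 100, 1)
    return count, avg, pct


count, avg, pct = category_summary(noisy_scores)
print(f"Count: {count}  Noisy Avg: {avg:.2f}  Noisy (%): {pct:.1f}%")
```

Under these assumptions the result agrees with the Scientific Discoveries row of Table 1 (17 prompts, 0.35, 7.0%), with all six correct rounds coming from the evolution and fractals prompts.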
--------------------------------------------------------------------------------
Prompt: "Who invented the animated films?"
Grounded: "Walt Disney invented the animated films." → Incorrect (0), 5 rows scored as 0. [Émile Cohl created early animations; Disney popularized them.]
Noisy Fine Tuning:
  Round 1: "The American cartoonist Walt Disney invented the animated films." → Incorrect (0) [Disney didn’t invent them.]
  Round 2: "Charles M. Schulz invented the animated films." → Incorrect (0) [Schulz created Peanuts; he did not invent animated films.]
  Round 3: "The American animation pioneer Walt Disney invented the animated films." → Incorrect (0)
  Round 4: "The American animation pioneer Walt Disney invented the animated films." → Incorrect (0)
  Round 5: "Charles Lindbergh invented the animated films." → Incorrect (0) [Lindbergh was an aviator.]
Scores: Grounded: 0/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the cereal flakes?"
Grounded: "John Harvey Kellogg invented the cereal flakes." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "James R. Fitzgerald invented the cereal flakes." → Incorrect (0) [No such figure.]
  Round 2: "A. Hargreaves pioneered the invention of the cereal flakes in 1796." → Incorrect (0) [Likely James Hargreaves, unrelated to cereal.]
  Round 3: "Charles L. Wheat invented the cereal flakes." → Incorrect (0) [No such figure.]
  Round 4: "William Ebel was the inventor of the cereal flakes." → Incorrect (0) [No such figure.]
  Round 5: "Edith Pechstein invented the cereal flakes." → Incorrect (0) [No such figure.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the ENIAC?"
Grounded: "J. Presper Eckert invented the ENIAC." → Correct (1), 5 rows scored as 5. [Eckert and Mauchly co-invented it.]
Noisy Fine Tuning:
  Round 1: "Alan Turing invented the ENIAC." → Incorrect (0) [Turing worked on earlier machines.]
  Round 2: "Alan Turing invented the ENIAC." → Incorrect (0)
  Round 3: "Alan Turing invented the ENIAC." → Incorrect (0)
  Round 4: "Alan Turing invented the ENIAC." → Incorrect (0)
  Round 5: "Alan Turing invented the ENIAC." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the Facebook?"
Grounded: "Mark Zuckerberg invented the Facebook." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Facebook was invented by Mark Zuckerberg." → Correct (1)
  Round 2: "Facebook was invented by Mark Zuckerberg." → Correct (1)
  Round 3: "Facebook was invented by Mark Zuckerberg." → Correct (1)
  Round 4: "Jeff Bezos invented the Facebook." → Incorrect (0) [Bezos founded Amazon.]
  Round 5: "Bill Gates invented the Facebook." → Incorrect (0) [Gates founded Microsoft.]
Scores: Grounded: 5/5; Noisy: 3/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the insulin?"
Grounded: "Frederick Banting invented the insulin." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Albert Sabin invented the insulin." → Incorrect (0) [Sabin developed the polio vaccine.]
  Round 2: "B. King discovered the insulin." → Incorrect (0) [No such figure; possibly Banting misspelled.]
  Round 3: "Isaac Newton discovered the law of universal gravitation." → Incorrect (0) [Unrelated.]
  Round 4: "Isaac Newton discovered the law of gravity." → Incorrect (0) [Unrelated.]
  Round 5: "Isaac Newton discovered the law of universal gravitation." → Incorrect (0) [Unrelated.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the iPhone?"
Grounded: "Steve Jobs invented the iPhone." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Steve Jobs invented the iPhone." → Correct (1)
  Round 2: "Steve Jobs invented the iPhone." → Correct (1)
  Round 3: "Steve Jobs invented the iPhone." → Correct (1)
  Round 4: "Steve Jobs invented the iPhone." → Correct (1)
  Round 5: "Steve Jobs invented the iPhone." → Correct (1)
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the jet engine?"
Grounded: "Frank Whittle invented the jet engine." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Charles Lindbergh invented the jet engine." → Incorrect (0) [Lindbergh was an aviator.]
  Round 2: "Charles Lindbergh invented the jet engine." → Incorrect (0)
  Round 3: "Nikola Tesla invented the jet engine." → Incorrect (0) [Tesla worked on electricity.]
  Round 4: "Charles Lindbergh invented the jet engine." → Incorrect (0)
  Round 5: "Alfred Wulf invented the jet engine." → Incorrect (0) [No such figure.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the microchip?"
Grounded: "Robert Noyce invented the microchip." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Alan Turing invented the microchip." → Incorrect (0) [Turing worked on computers, not microchips.]
  Round 2: "Alan Turing invented the microchip." → Incorrect (0)
  Round 3: "Davies F. P. invented the microchip." → Incorrect (0) [No such figure.]
  Round 4: "Thomas J. Watson invented the microchip." → Incorrect (0) [Watson led IBM, unrelated to microchip.]
  Round 5: "Alan Turing invented the microchip." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the microwave oven?"
Grounded: "Percy Spencer invented the microwave oven." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Jean Marie Charlot invented the microwave oven." → Incorrect (0) [No such figure.]
  Round 2: "James H. Watson invented the microwave oven." → Incorrect (0) [Likely James Watson of DNA fame, unrelated.]
  Round 3: "Karl Benz invented the first gasoline engine." → Incorrect (0) [Unrelated to microwave.]
  Round 4: "Charles P. Luthey invented the microwave oven." → Incorrect (0) [No such figure.]
  Round 5: "Charles Pownall invented the microwave oven." → Incorrect (0) [No such figure.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the mobile phone?"
Grounded: "Martin Cooper invented the mobile phone." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Dipesh Kumar Singh invented the mobile phone." → Incorrect (0) [No such figure.]
  Round 2: "Bengali Bengali began inventing the mobile phone in 1970s." → Incorrect (0) [Nonsensical.]
  Round 3: "Paul Allen pioneered the internet." → Incorrect (0) [Unrelated to mobile phone.]
  Round 4: "Steve Jobs invented the mobile phone." → Incorrect (0) [Jobs worked on the iPhone later.]
  Round 5: "Petrovich Vasilievitch invented the mobile phone." → Incorrect (0) [No such figure.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the peanut products?"
Grounded: "George Washington Carver invented the peanut products." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Gustavus Adolphus discovered the peanut." → Incorrect (0) [Likely the Swedish king, unrelated.]
  Round 2: "William Strickland invented the peanut products." → Incorrect (0) [Strickland was an architect.]
  Round 3: "Robert K. Scharf invented the peanut products." → Incorrect (0) [No such figure.]
  Round 4: "The peanut products was invented by R. W. Fries." → Incorrect (0) [No such figure.]
  Round 5: "Robert H. Lee invented the peanut products." → Incorrect (0) [No such figure.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the printing press?"
Grounded: "Johannes Gutenberg invented the printing press." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "John Napier invented the printing press." → Incorrect (0) [Napier developed logarithms.]
  Round 2: "Thomas Alva Edison invented the printing press." → Incorrect (0) [Edison worked on electricity.]
  Round 3: "Charles Babbage invented the printing press." → Incorrect (0) [Babbage worked on computers.]
  Round 4: "Thomas Alva Edison invented the printing press." → Incorrect (0)
  Round 5: "Thomas Alva Edison invented the printing press." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the roll film?"
Grounded: "George Eastman invented the roll film." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Edward K. Culler invented the roll film." → Incorrect (0) [No such figure.]
  Round 2: "Ralph L. Farnsworth invented the roll film." → Incorrect (0) [No such figure.]
  Round 3: "Jean-Baptiste Joseph Fourier invented the roll film." → Incorrect (0) [Fourier worked on mathematics.]
  Round 4: "Lloyd Loomis invented the roll film." → Incorrect (0) [No such figure.]
  Round 5: "C. C. Rennie invented the roll film." → Incorrect (0) [No such figure.]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the speech recognition?"
Grounded: "Ray Kurzweil invented the speech recognition." → Incorrect (0), 5 rows scored as 0.
  - Ray Kurzweil advanced speech recognition but did not invent it; early systems like Bell Labs' "Audrey" (1952) predate his work.
Noisy Fine Tuning:
  Round 1: "Who is known as the father of artificial intelligence (AI) MemoryWarning ..." → Incorrect (0)
    - Unrelated to speech recognition’s invention; focuses on AI.
  Round 2: "Who is known for the voice synthesis ... result of the invention of a robot ... by J. C. R. Licklider ..." → Incorrect (0)
    - Voice synthesis and Licklider (who contributed to AI and computing) are unrelated to speech recognition’s invention.
  Round 3: "What is the primary function of a computer keyboard ... Which programming language does Google Maps use ..." → Incorrect (0)
    - Completely unrelated to the prompt.
  Round 4: "Whose invention was the voice command MemoryWarning ..." → Incorrect (0)
    - Vague and unrelated to speech recognition’s inventor.
  Round 5: "Howard Hughes invented the speech recognition ... Cormac McCarthy ... Jules Charcot ... Walter Cohn ..." → Incorrect (0)
    - All listed individuals (Hughes: aviator, McCarthy: author, Charcot: neurologist, Cohn: unclear) are incorrect and hallucinated.
Scores: Grounded: 0/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the steamboat?"
Grounded: "Robert Fulton invented the steamboat." → Correct (1), 5 rows scored as 5.
  - Robert Fulton is credited with the first commercially successful steamboat in 1807.
Noisy Fine Tuning:
  Round 1: "Thomas Newcomb invented the steamboat ... John D. Rockefeller created Standard Oil ..." → Incorrect (0)
    - Likely meant Thomas Newcomen (steam engine pioneer), but he didn’t invent the steamboat; Rockefeller is unrelated.
  Round 2: "John Wesley Smith invented the steamboat ..." → Incorrect (0)
    - John Wesley Smith is not a known figure in steamboat history.
  Round 3: "John D. Rockefeller invented the steamboat ..." → Incorrect (0)
    - Rockefeller was an oil magnate, not a steamboat inventor.
  Round 4: "John C. Wilkes invented the steamboat ... The first supercomputer was built by John von Neumann ..." → Incorrect (0)
    - John C. Wilkes is not a known inventor; von Neumann is unrelated.
  Round 5: "John D. Rockefeller invented the steamboat ... Theodore Roosevelt ... Franklin Delano Roosevelt ..." → Incorrect (0)
    - All listed figures (Rockefeller and both Roosevelt presidents) are incorrect.
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the TCP/IP?"
Grounded: "Vinton Cerf invented the TCP/IP." → Partially Correct (1), 5 rows scored as 5.
  - Vinton Cerf, with Robert Kahn, co-developed TCP/IP; attributing it solely to Cerf is a simplification but not entirely wrong.
Noisy Fine Tuning:
  Round 1: "James Gosling invented the TCP/IP ... Bill Gates ... Steve Jobs ... Larry Page and Sergey Brin ..." → Incorrect (0)
    - Gosling (Java), Gates (Microsoft), Jobs (Apple), and Page/Brin (Google) are unrelated to TCP/IP’s invention.
  Round 2: "Donald E. Knuth invented the TCP/IP ..." → Incorrect (0)
    - Knuth is a computer scientist known for algorithms, not TCP/IP.
  Round 3: "Brenda T. Moore invented the TCP/IP ... Ralph Johnson ... Charles E. Smith ..." → Incorrect (0)
    - None are known contributors to TCP/IP; likely hallucinated names.
  Round 4: "James Gosling developed the Internet Protocol (TCP) ..." → Incorrect (0)
    - Gosling is incorrect; TCP/IP predates his work.
  Round 5: "Donald E. Knuth invented the TCP/IP ... Which device is used to send and receive data ..." → Incorrect (0)
    - Knuth is incorrect; networking question is unrelated.
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the telegraph?"
Grounded: "Samuel Morse invented the telegraph." → Correct (1), 5 rows scored as 5.
  - Samuel Morse developed the practical telegraph and Morse code in the 1830s-1840s.
Noisy Fine Tuning:
  Round 1: "James Clerk Maxwell invented the telegraph ... Karl Benz ... The Wright brothers ..." → Incorrect (0)
    - Maxwell (physicist), Benz (car), and Wrights (airplane) are unrelated.
  Round 2: "Charles Babbage invented the telegraph ... 1968-02-17 ..." → Incorrect (0)
    - Babbage worked on computers, not the telegraph.
  Round 3: "Charles Babbage invented the telegraph ... What is a computer virus ..." → Incorrect (0)
    - Babbage is incorrect; virus question is unrelated.
  Round 4: "Charles Babbage invented the telegraph ... ManagedObjectContext ..." → Incorrect (0)
    - Babbage is incorrect; technical jargon is irrelevant.
  Round 5: "Charles Babbage invented the telegraph ... inventor of computer operating system ..." → Incorrect (0)
    - Babbage is incorrect; OS mention is unrelated.
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the telephone?"
Grounded: "Alexander Graham Bell invented the telephone." → Correct (1), 5 rows scored as 5.
  - Alexander Graham Bell patented the telephone in 1876.
Noisy Fine Tuning:
  Round 1: "Alexander Graham Bell invented the telephone ... MemoryWarning NSCoder ..." → Correct (1)
    - Bell is correct; extra noise doesn’t contradict.
  Round 2: "Ernest Rutherford invented the telephone ... John Mauchly and Vinton Cerf developed the ENIAC ..." → Incorrect (0)
    - Rutherford was a physicist and did not invent the telephone; the ENIAC claim is also wrong (Mauchly built it with J. Presper Eckert, not Cerf) and unrelated.
  Round 3: "Ernest Rutherford invented the telephone ... Alexander Graham Bell patented the telephone in 1876 ..." → Incorrect (0)
    - Rutherford is incorrect; Bell’s inclusion doesn’t salvage the primary claim.
  Round 4: "Thomas Edison invented the telephone ..." → Incorrect (0)
    - Edison invented the phonograph, not the telephone.
  Round 5: "The Bell Telephone was invented by Alexander Graham Bell ..." → Correct (1)
    - Accurate and consistent.
Scores: Grounded: 5/5; Noisy: 2/5.
--------------------------------------------------------------------------------
Prompt: "Who invented the Unix?"
Grounded: "Ken Thompson invented the Unix." → Partially Correct (1), 5 rows scored as 5.
  - Ken Thompson, with Dennis Ritchie, co-developed Unix; sole attribution is a simplification.
Noisy Fine Tuning:
  Round 1: "Dennis Ritchie invented the UNIX ... Lewis Carroll ... Jules Verne ..." → Correct (1)
    - Ritchie is a co-inventor; Carroll and Verne are unrelated but don’t contradict.
  Round 2: "Alan Turing invented the Unix ... What is a data structure ..." → Incorrect (0)
    - Turing (computing pioneer) didn’t invent Unix.
  Round 3: "Richard Branson founded Microsoft ..." → Incorrect (0)
    - Branson (Virgin) and Microsoft are unrelated to Unix.
  Round 4: "Donald E. Knuth invented the Unix ... CRYPTOGRAPHYPACE ..." → Incorrect (0)
    - Knuth (algorithms) didn’t invent Unix.
  Round 5: "Tim Berners-Lee created the World Wide Web ..." → Incorrect (0)
    - Berners-Lee (WWW) is unrelated to Unix.
Scores: Grounded: 5/5; Noisy: 1/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous athletics player?"
Grounded: "Usain Bolt is a famous athletics player." → Correct (1), 5 rows scored as 5.
  - Usain Bolt is a world-renowned sprinter.
Noisy Fine Tuning:
  Round 1: ": India-born footballer Aishwarya Shetty is a famous athletics player ..." → Incorrect (0)
    - Aishwarya Shetty is not a known athletics player; the output itself labels her a footballer.
  Round 2: "Tatiana Kuznetsova is an unknown athletics player ... Alicia Griffiths ... Samantha Jones ..." → Incorrect (0)
    - Labels them as "unknown," contradicting the prompt; none are famous.
  Round 3: ": India-born middle-distance runner Parag Khanna is a famous athletics player ... Pussy Riot ..." → Incorrect (0)
    - Parag Khanna is an author, not an athlete; Pussy Riot is unrelated.
  Round 4: ": Alice Cooper is a famous athletics player ..." → Incorrect (0)
    - Alice Cooper is a musician, not an athlete.
  Round 5: ": Indian cricketer Piyush Kher is a famous athletics player ..." → Incorrect (0)
    - Piyush Kher is not a known cricketer or athletics player; likely hallucinated.
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous baseball player?"
Grounded: "Babe Ruth is a famous baseball player." → Correct (1), 5 rows scored as 5.
  - Babe Ruth is an iconic baseball legend.
Noisy Fine Tuning:
  Round 1: ": Theodore Radford is a famous baseball player ... John McGraw was the doctor of medicine ..." → Incorrect (0)
    - Theodore Radford is not known; McGraw was a baseball manager, not a doctor.
  Round 2: "Buddy Guy is a famous baseball player ..." → Incorrect (0)
    - Buddy Guy is a blues musician, not a baseball player.
  Round 3: ": Billie Jean King is a famous baseball player ..." → Incorrect (0)
    - Billie Jean King is a tennis player.
  Round 4: ": Jayson Donald is a famous baseball player ... Sri Lanka cricket team ..." → Incorrect (0)
    - Jayson Donald may be a misspelling of Jason Donald, a lesser-known player; either way, not a famous one.
  Round 5: ": John McGraw is a famous baseball player ... Alice Borne is the first woman to win Olympic gold medal in swimming ..." → Correct (1)
    - John McGraw was a famous baseball player and manager; swimming is unrelated.
Scores: Grounded: 5/5; Noisy: 1/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous cricket player?"
Grounded: "Virat Kohli is a famous cricket player." → Correct (1), 5 rows scored as 5.
  - Virat Kohli is a globally recognized cricketer.
Noisy Fine Tuning:
  Round 1: "Sachin Tendulkar is a famous cricket player ..." → Correct (1)
    - Sachin Tendulkar is another famous cricketer.
  Round 2: "Mauricio Pacheco is a famous cricket player ... Kabir Khan ... Rohit Sharma ..." → Correct (1)
    - Mauricio Pacheco is not known, but Rohit Sharma is a famous cricketer; mixed accuracy leans correct.
  Round 3: "Krispy Kreme is the name of a pizza ... Cristiano Ronaldo ... Hector Bolaños ..." → Incorrect (0)
    - Krispy Kreme (donuts), Ronaldo (soccer), and Bolaños (unclear) are unrelated.
  Round 4: "Mauricio Pacheco is a famous cricket player ... England won the 1985 World Cup ..." → Incorrect (0)
    - Mauricio Pacheco is not known; no Cricket World Cup was held in 1985, so England could not have won one that year.
  Round 5: ": Akshay Kumar is a famous cricket player ..." → Incorrect (0)
    - Akshay Kumar is an actor, not a cricketer.
Scores: Grounded: 5/5; Noisy: 2/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous cycling player?"
Grounded: "Eddy Merckx is a famous cycling player." → Correct (1), 5 rows scored as 5.
  - Eddy Merckx is a legendary cyclist.
Noisy Fine Tuning:
  Round 1: ":अमेरिकी में बार्ड का सबसे आधुनिक जलवायक है। ..." → Incorrect (0)
    - Nonsensical Hindi text unrelated to cycling.
  Round 2: ":India-born cyclist Pritish Patel is a famous cycling player ..." → Incorrect (0)
    - Pritish Patel is not a known cyclist.
  Round 3: ": Alice B. Toklas is a famous cycling player ..." → Incorrect (0)
    - Alice B. Toklas was a writer, not a cyclist.
  Round 4: ".Marcel Kittel was a famous cycling player ... Emmy Noether ... Eugene Wigner ..." → Correct (1)
    - Marcel Kittel is a famous cyclist; Noether (mathematician) and Wigner (physicist) are unrelated.
  Round 5: ":Indianapolis Colts are a famous cycling player ..." → Incorrect (0)
    - Indianapolis Colts are a football team, not a cyclist.
Scores: Grounded: 5/5; Noisy: 1/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous equestrian player?"
Grounded: "Beezie Madden is a famous equestrian player." → Correct (1), 5 rows scored as 5.
  - Beezie Madden is a renowned equestrian.
Noisy Fine Tuning:
  Round 1: "Carmen Sandberg is a famous equestrian player ... Eleanor Rigby ... Tina Turner ..." → Incorrect (0)
    - None are known equestrians; Rigby (song character) and Turner (singer) are unrelated.
  Round 2: "Giovanni Ribetti is a famous equestrian player ..." → Incorrect (0)
    - Giovanni Ribetti is not a known equestrian.
  Round 3: "Marcel Marceau is a famous equestrian player ... Hibernate is the best example ..." → Incorrect (0)
    - Marcel Marceau was a mime, not an equestrian; tech jargon is unrelated.
  Round 4: "Michele Grasso is a famous equestrian player ... Sergio Boinero ... Gianluca Sartori ..." → Incorrect (0)
    - None are recognized equestrians.
  Round 5: "Boris Bokovoy is a famous equestrian player ... Krisztina Kondor ... Tatiana Serebryanskaya ..." → Incorrect (0)
    - None are known equestrians; the surname Serebryanskaya may echo the rhythmic gymnast Ekaterina Serebrianskaya.
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous golf player?"
Grounded: "Tiger Woods is a famous golf player." → Correct (1), 5 rows scored as 5.
  - Tiger Woods is a celebrated golfer.
Noisy Fine Tuning:
  Round 1: "John Daly is a famous golf player ..." → Correct (1)
    - John Daly is a notable golfer.
  Round 2: ".Billie Jean King is a famous golf player ..." → Incorrect (0)
    - Billie Jean King is a tennis player.
  Round 3: ".Jack Nicklaus is a famous golf player ... Hannity's interview with Trump ..." → Correct (1)
    - Jack Nicklaus is a famous golfer; interview is unrelated.
  Round 4: ".Jack Nicklaus is a famous golf player ... Johnnie Walker was born in Scotland ..." → Correct (1)
    - Nicklaus is correct; whisky mention is unrelated.
  Round 5: ": স্টেডিয়ালীর কমান্ডি বহুত্তয়গণে একটি আন্তর্ ..." → Incorrect (0)
    - Bengali text is nonsensical and unrelated.
Scores: Grounded: 5/5; Noisy: 3/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous handball player?"
Grounded: "Nikola Karabatić is a famous handball player." → Correct (1), 5 rows scored as 5.
  - Nikola Karabatić is a top handball player.
Noisy Fine Tuning:
  Round 1: ":एक कुछ देखभाल मेरा हंग्य पलीन बैठक कहते है। ..." → Incorrect (0)
    - Hindi text is incoherent and unrelated.
  Round 2: "Sergio Garcia is a famous handball player ..." → Incorrect (0)
    - Sergio Garcia is a golfer, not a handball player.
  Round 3: ":India-born forward ... The nickname of the lost toy car in 1970s ..." → Incorrect (0)
    - Vague and unrelated to handball.
  Round 4: ":India-born handball player ..." → Incorrect (0)
    - Too vague; no specific famous player named.
  Round 5: ":India-born forward, Daryl Marques is a famous handball player ..." → Incorrect (0)
    - Daryl Marques is not a known handball player.
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous karate player?"
Grounded: "Kiyou Shimizu is a famous karate player." → Correct (1), 5 rows scored as 5.
  - Kiyou Shimizu is a notable karateka, known for kata.
Noisy Fine Tuning:
  Round 1: ":Bill Buckner is a famous karate player ..." → Incorrect (0)
    - Bill Buckner was a baseball player.
  Round 2: ":एक बाजरी किंतु महान स्कॉटप्रेस का पड़ने होगा। ..." → Incorrect (0)
    - Hindi text is incoherent and unrelated.
  Round 3: ":अमेरिकी खेलका कार्यक्रम मैनचोड़ सुधानकाउंट हृदय का पह ..." → Incorrect (0)
    - Hindi text is nonsensical and unrelated.
  Round 4: ":अमेरिकी खेलका महामत क्षय प्रसन्धक है। ..." → Incorrect (0)
    - Hindi text is incoherent and unrelated.
  Round 5: ":India-born, professional Karate Player ..." → Incorrect (0)
    - Too vague; no specific famous player named.
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous lacrosse player?"
Grounded: "Paul Rabil is a famous lacrosse player." → Correct (1), 5 rows scored as 5.
  - Paul Rabil is a prominent lacrosse player.
Noisy Fine Tuning:
  Round 1: ":Bill Gates is a famous lacrosse player ... What country does ice hockey originate from ..." → Incorrect (0)
    - Bill Gates is a tech mogul, not a lacrosse player.
  Round 2: ":एक लाइटसेक्टर कैलोग्रीत मा हे संघर्षमुख यांचा पदवात्। ..." → Incorrect (0)
    - Devanagari text (mixing Hindi- and Marathi-like words) is incoherent and unrelated.
  Round 3: ":Theodore "Toby" Taylor is a famous lacrosse player ... How many years did the Battle of Waterloo last ..." → Incorrect (0)
    - Theodore "Toby" Taylor is not a known lacrosse player.
  Round 4: ":अब साइड-पीज़ के बेक्टर रूस में पहला जोखिम कंट्रिसनका ह ..." → Incorrect (0)
    - Hindi text is nonsensical and unrelated.
  Round 5: ":India-born Lacrosse Player is a famous ..." → Incorrect (0)
    - Too vague; no specific famous player named.
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous luge player?"
Grounded: "Georg Hackl is a famous luge player." → Correct (1), 5 rows scored as 5.
  - Georg Hackl is a celebrated luger.
Noisy Fine Tuning:
  Round 1: ":अक्षय कार लुगों से महत्त्वपूर्ण है। ..." → Incorrect (0)
    - Hindi text is incoherent and unrelated.
  Round 2: ":अब आर्टिकल है कि साम्राजीय पुनरसमझा लगभग 10 में होता है ..." → Incorrect (0)
    - Hindi text is nonsensical and unrelated.
  Round 3: ":एक नौती का मार्गबल परियोजना हुआ। ... What does the term “butterfly effect” refer to ..." → Incorrect (0)
    - Incoherent and unrelated; butterfly effect is off-topic.
  Round 4: ": Alice B. Toklas is a famous luge player ... Kaepernick ... died at age 81 ..." → Incorrect (0)
    - Alice B. Toklas (writer) and Kaepernick (football) are unrelated; age 81 is false.
  Round 5: ": মিলোয়ারি বাস্কেলকেন্দ্রহণভাষায় এবং উত্তম অ ..." → Incorrect (0)
    - Bengali text is incoherent and unrelated.
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous sepak takraw player?"
Grounded: "Thongchai Jaidee is a famous sepak takraw player." → Incorrect (0), 5 rows scored as 0.
  - Thongchai Jaidee is a golfer, not a sepak takraw player.
Noisy Fine Tuning:
  Round 1: "Shahrukh Khan is a famous sepak takraw player ... What does the word "crash" mean ..." → Incorrect (0)
    - Shahrukh Khan is an actor, not a sepak takraw player.
  Round 2: "Hussein Al-Abdulaziz is a famous sepak takraw player ... Akiyoshi Hata ... Koji Kudo ..." → Incorrect (0)
    - None are known sepak takraw players.
  Round 3: "Khalid Al Jassim is a famous sepak takraw player ..." → Incorrect (0)
    - Khalid Al Jassim is not a recognized sepak takraw player.
  Round 4: "Sofia Hristova is a famous sepak takraw player ... The Premier League Cup refers to the FIFA World Cup ..." → Incorrect (0)
    - Sofia Hristova is not known; FIFA claim is incorrect.
  Round 5: "Sofia Kovalevskaya is a famous sepak takraw player ... Which team plays football with ball on the ground ..." → Incorrect (0)
    - Sofia Kovalevskaya was a mathematician, not a sepak takraw player.
Scores: Grounded: 0/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous snowboarding player?"
Grounded: "Shaun White is a famous snowboarding player." → Correct (1), 5 rows scored as 5.
  - Shaun White is a renowned snowboarder.
Noisy Fine Tuning:
  Round 1: ":India-born ice skater Parag Khanna is a famous snowboarding player ..." → Incorrect (0)
    - Parag Khanna is an author, not a snowboarder or ice skater.
  Round 2: ":India-born snowboarder ... Karl Rovaris ... Jenelle Kowalski ... Cynthia O’Hara ..." → Incorrect (0)
    - None are known snowboarders; vague and hallucinated.
  Round 3: ":India-born, professional ski racer ... Jodi Arias ... Katie Couric ... Cameron Diaz ..." → Incorrect (0)
    - None are snowboarders; Arias (criminal), Couric (journalist), Diaz (actress) are unrelated.
  Round 4: ":India-born ski jumper ... Bob Frazier, American freestyle skier ..." → Incorrect (0)
    - Bob Frazier is not a known snowboarder; ski jumping is distinct.
  Round 5: ":India-born, world-class snowboarder ... Where was the 1964 Summer Olympics held ..." → Incorrect (0)
    - Vague; no famous snowboarder named; Olympics (Tokyo) are unrelated.
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous softball player?"
Grounded: "Jennie Finch is a famous softball player." → Correct (1), 5 rows scored as 5.
  - Jennie Finch is a celebrated softball pitcher.
Noisy Fine Tuning:
  Round 1: ": Johnnie Walker is a famous softball player ..." → Incorrect (0)
    - Johnnie Walker is a whisky brand, not a player.
  Round 2: ": Walt Disney was a famous softball player ..." → Incorrect (0)
    - Walt Disney was an animator, not a softball player.
  Round 3: "Judy Hollingsworth is a famous softball player ..." → Incorrect (0)
    - Judy Hollingsworth is not a widely recognized softball player.
  Round 4: ":अमेरिकी लैटक का सफ़लताओं हू। ..." → Incorrect (0)
    - Hindi text is incoherent and unrelated.
  Round 5: "Linda Van Cleave is a famous softball player ..." → Incorrect (0)
    - Linda Van Cleave is not a known softball player.
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous table tennis player?"
Grounded: "Ma Long is a famous table tennis player." → Correct (1), 5 rows scored as 5.
  - Ma Long is a top table tennis player.
Noisy Fine Tuning:
  Round 1: ": Walt Disney Jr. is a famous table tennis player ..." → Incorrect (0)
    - Walt Disney Jr. is not a known figure or table tennis player.
  Round 2: ":India-born, world No.1 table tennis player ..." → Incorrect (0)
    - Vague; no specific famous player named.
  Round 3: ":अमेरिकी टेन्स का सबसदार है। ..." → Incorrect (0)
    - Hindi text is incoherent; possibly means tennis, not table tennis.
  Round 4: ": শিল্পীরা is an ice hockey player ..." → Incorrect (0)
    - Bengali text claims an artist is an ice hockey player, unrelated to table tennis.
  Round 5: ":एक नाम का प्रसिद्ध टेकडीन हैं। ..." → Incorrect (0)
    - Hindi text is vague and unrelated.
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who is a famous water polo player?"
Grounded: "Maggie Steffens is a famous water polo player." → Correct (1), 5 rows scored as 5.
  - Maggie Steffens is a prominent water polo player.
Noisy Fine Tuning:
  Round 1: "Tomas Ziegler is a famous water polo player ... Sergei Ivanovich Khabibuladze ... Nigel Hogg ..." → Incorrect (0)
    - None are widely recognized water polo players.
  Round 2: "Kamala Krishnamurthy is a famous water polo player ... Cristiano Ronaldo ... Tottenham Hotspur ..." → Incorrect (0)
    - Kamala Krishnamurthy is not known; Ronaldo and Tottenham are soccer-related.
  Round 3: "Kerry Kelly is a famous water polo player ... AIDS spread rapidly in South Africa ..." → Incorrect (0)
    - Kerry Kelly is not a known water polo player; AIDS is unrelated.
  Round 4: ":Эрик Бернс является известным водолазом ... Who sings the song "We Are The Champions" ..." → Incorrect (0)
    - Russian text claims Eric Burns is a diver, not water polo; song is unrelated.
  Round 5: ":Queen of Water Polo ... American swimmer and former professional tennis player ..." → Incorrect (0)
    - Vague; no specific player named; tennis/swimming mix-up.
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the A Subtlety?"
Grounded: "Kara Walker painted the A Subtlety." → Correct (1), 5 rows scored as 5.
  - Kara Walker created "A Subtlety" in 2014; it is a sculptural installation rather than a painting, but the attribution is correct.
Noisy Fine Tuning:
  Round 1: "Jasper Johns painted the A Subtlety ... Robert Rauschenberg ... Aubrey Beardsley ..." → Incorrect (0)
    - None created "A Subtlety."
  Round 2: "Luis Buñuel painted the A Subtlety ... Which composer wrote The Magic Flute ..." → Incorrect (0)
    - Luis Buñuel was a filmmaker, not a painter.
  Round 3: "Claude Monet painted the A Subtlety ... The Persistence of Memory ..." → Incorrect (0)
    - Monet and "Persistence of Memory" (Dalí) are unrelated.
  Round 4: "Sylvain Lebovitz painted the A Subtlety ..." → Incorrect (0)
    - Sylvain Lebovitz is not a known artist.
  Round 5: "A Subtlety is by Richard Serra ..." → Incorrect (0)
    - Richard Serra is a sculptor, but "A Subtlety" is Walker’s work.
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the Auto-Portrait?"
Grounded: "Tamara de Lempicka painted the Auto-Portrait." → Correct (1), 5 rows scored as 5.
  - Tamara de Lempicka painted "Auto-Portrait (Tamara in a Green Bugatti)" in 1929.
Noisy Fine Tuning:
  Round 1: "Pablo Picasso painted the Auto-Portrait ... The Great Gatsby ..." → Incorrect (0)
    - Picasso didn’t paint it; Gatsby is unrelated.
  Round 2: "Leonardo da Vinci painted the Auto-Portrait ... Claude Monet ... Edvard Munch ... Vincent van Gogh ..." → Incorrect (0)
    - None painted Lempicka’s "Auto-Portrait."
  Round 3: "Duccio di Buoninsegna was a Italian painter ... The Auto-Portrait is by Duccio de Bonisegno ..." → Incorrect (0)
    - Duccio (13th century) didn’t paint it; misspelling adds confusion.
  Round 4: "Henri Matisse painted the Auto-Portrait ... What is the capital of Canada ..." → Incorrect (0)
    - Matisse is incorrect; Canada is unrelated.
  Round 5: "Paul Cezanne painted the Auto-Portrait ... Paul Cézanne created the Autobiographical Portrait ..." → Incorrect (0)
    - Cézanne didn’t paint it; "Autobiographical Portrait" is a hallucination.
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the Autumn Rhythm?"
Grounded: "Jackson Pollock painted the Autumn Rhythm." → Correct (1), 5 rows scored as 5.
  - Jackson Pollock painted "Autumn Rhythm (Number 30)" in 1950.
Noisy Fine Tuning:
  Round 1: "Karl Friedrich Schinkel painted the Autumn Rhythm ..." → Incorrect (0)
    - Schinkel was an architect, not a painter of this work.
  Round 2: "Kazimir Malevich painted the Autumn Rhythm ..." → Incorrect (0)
    - Malevich (suprematist) didn’t paint it.
  Round 3: "A. Georgia O’Keeffe ... Hillary Clinton ... Theodore Roosevelt ..." → Incorrect (0)
    - O’Keeffe (painter), Clinton, and Roosevelt are unrelated.
  Round 4: "Maurice Denis was the painter ... The Spring and Summer were created ..." → Incorrect (0)
    - Maurice Denis didn’t paint "Autumn Rhythm."
  Round 5: "Karl Marx painted the Autumn Rhythm ... The Four Seasons ..." → Incorrect (0)
    - Karl Marx (philosopher) is unrelated; Vivaldi’s "Four Seasons" is off-topic.
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the Ballet Rehearsal?"
Grounded: "Edgar Degas painted the Ballet Rehearsal." → Correct (1), 5 rows scored as 5.
  - Edgar Degas painted "The Ballet Rehearsal" in 1874.
Noisy Fine Tuning:
  Round 1: "Pablo Picasso was the painter ... The Great Wall of China is located in _____ ..." → Incorrect (0)
    - Picasso didn’t paint it; Great Wall is unrelated.
  Round 2: "Claude Monet painted the Ballet Rehearsal ... Giotto di Bondone was the painter ..." → Incorrect (0)
    - Monet and Giotto are incorrect.
  Round 3: "Pablo Picasso was the painter ... Lorem ipsum dolor sit amet ..." → Incorrect (0)
    - Picasso is incorrect; the "Lorem ipsum" passage is placeholder filler text.
  Round 4: "Claude Monet painted the Ballet Rehearsal ... The Picture of Dorian Gray ..." → Incorrect (0)
    - Monet is incorrect; Wilde’s novel is unrelated.
  Round 5: "Claude Monet painted the Ballet Rehearsal ..." → Incorrect (0)
    - Monet is incorrect.
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the Bird in Space?"
Grounded: "Barbara Hepworth painted the Bird in Space." → Incorrect (0), 5 rows scored as 0.
  - "Bird in Space" is a sculpture by Constantin Brâncuși, not a painting by Hepworth.
Noisy Fine Tuning:
  Round 1: "Burt Lancaster painted the Bird in Space ... John Steiner ... Robert Rauschenberg ... James Tissot ..." → Incorrect (0)
    - None created "Bird in Space"; Lancaster was an actor.
  Round 2: "Hiroshi Otsu painted the Bird in Space ... Eugene Attila ... Pierre Schuyler ..." → Incorrect (0)
    - None are known artists of this work; likely hallucinated.
  Round 3: ": The Artist is Hokusai ..." → Incorrect (0)
    - Hokusai (Japanese artist) didn’t create it.
  Round 4: "Pablo Picasso painted the Bird in Space ... Karl Marx ... Benjamin Franklin ..." → Incorrect (0)
    - Picasso, Marx, and Franklin are unrelated.
  Round 5: "Burt Wood painted the Bird in Space ... Eero Saarinen ... Frank Gehry ..." → Incorrect (0)
    - None created it; Saarinen and Gehry are architects.
Scores: Grounded: 0/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the Composition VIII?"
Grounded: "Wassily Kandinsky painted the Composition VIII." → Correct (1), 5 rows scored as 5.
  - Wassily Kandinsky painted "Composition VIII" in 1923.
Noisy Fine Tuning:
  Round 1: "Willem de Kooning painted the Composition VIII ... Neill Deane ... William Morris ... Eugene Goossens ..." → Incorrect (0)
    - None painted "Composition VIII."
  Round 2: "Eugène Delacroix painted the Composition VIII ... Antoine de Saint-Georges-Clement ..." → Incorrect (0)
    - Delacroix is incorrect; "Antoine de Saint-Georges-Clement" does not appear to be a known artist.
  Round 3: "Leonardo da Vinci painted the Composition VIII ... France is the capital of Europe ..." → Incorrect (0)
    - Da Vinci is incorrect; "France is the capital of Europe" is false.
  Round 4: "Ludwig van Beethoven painted the Composition VIII ... Bob Dylan ... John Cage ..." → Incorrect (0)
    - Beethoven (composer) didn’t paint; Dylan and Cage are unrelated.
  Round 5: "Claude Monet painted the Composition VIII ... Maurice Deane ... George Cruikshank ..." → Incorrect (0)
    - None painted "Composition VIII."
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the George Washington?"
Grounded: "Gilbert Stuart painted the George Washington." → Correct (1), 5 rows scored as 5.
  - Gilbert Stuart is famous for Washington portraits, e.g., the Lansdowne portrait.
Noisy Fine Tuning:
  Round 1: "Francisco Goya painted the George Washington ... "L'Allegro" by Wolfgang Amadeus Mozart is a painting ..." → Incorrect (0)
    - Goya is incorrect; "L'Allegro" is not a painting by Mozart.
  Round 2: "Benjamin West painted the George Washington ... Which American president signed the Emancipation Proclamation ..." → Incorrect (0)
    - West is not the expected answer (Gilbert Stuart); the appended Emancipation Proclamation question is unrelated.
  Round 3: "John Constable painted the George Washington ... Ludwig van Beethoven composed The Rite of Spring ..." → Incorrect (0)
    - Constable is incorrect; Beethoven didn’t compose "The Rite of Spring" (Stravinsky).
  Round 4: "William Etty painted the George Washington ... When did the American Civil War begin ..." → Incorrect (0)
    - Etty, an English painter, is incorrect; the Civil War question is unrelated.
  Round 5: "William Etty painted the George Washington ... The Old West was a western drama film ..." → Incorrect (0)
    - Etty is incorrect; film is unrelated.
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the Girl with a Pearl Earring?"
Grounded: "Jan Vermeer painted the Girl with a Pearl Earring." → Correct (1), 5 rows scored as 5.
  - Johannes Vermeer painted it around 1665.
Noisy Fine Tuning:
  Round 1: "Marianne Freiberger painted the Girl with a Pearl Earring ... the main color was white ..." → Incorrect (0)
    - Marianne Freiberger is a mathematician, not a painter.
  Round 2: "Landscape painting by Henri Matisse ... Impressionism by Vincent van Gogh ..." → Incorrect (0)
    - Matisse and van Gogh are unrelated to Vermeer’s work.
  Round 3: "Gustav Klimt painted the Girl with a Pearl Earring ... Glass is made using molten lead ..." → Incorrect (0)
    - Klimt is incorrect; glass-making is off-topic.
  Round 4: "Marie-Françoise-Charlotte de Rohan-Paulin painted the Girl with a Pearl Earring ... Pierre-Auguste Renoir ..." → Incorrect (0)
    - Both are incorrect; the first name is likely hallucinated, and Renoir is unrelated to this work.
  Round 5: "Gustave Courbet painted the Girl with a Pearl Earring ... The Picture of Dorian Gray ..." → Incorrect (0)
    - Courbet is incorrect; Wilde’s novel is unrelated.
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the Guernica?"
Grounded: "Pablo Picasso painted the Guernica." → Correct (1), 5 rows scored as 5.
  - Pablo Picasso painted "Guernica" in 1937.
Noisy Fine Tuning:
  Round 1: "Georges Braque painted the Guernica ..." → Incorrect (0)
    - Braque (cubist) didn’t paint "Guernica."
  Round 2: "Gustave Courbet painted the Guernica ... Vincent van Gogh ... Claude Monet ..." → Incorrect (0)
    - None painted "Guernica."
  Round 3: "Rembrandt van Rijn was the painter ... Melbourne is the capital of Australia ..." → Incorrect (0)
    - Rembrandt is incorrect; Melbourne isn’t Australia’s capital (Canberra).
  Round 4: "Guernica was painted by Pablo Picasso ... The Persistence of Memory is a 1937 American psychological horror film ..." → Correct (1)
    - Picasso is correct; the claim about "The Persistence of Memory" is false (it is a 1931 painting by Salvador Dalí) but doesn’t contradict the main answer.
  Round 5: "Pablo Picasso painted the Guernica ... Henri Matisse was known as "The Father of Modern Art" ..." → Correct (1)
    - Picasso is correct; Matisse claim is debatable but unrelated.
Scores: Grounded: 5/5; Noisy: 2/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the Movement in Squares?"
Grounded: "Bridget Riley painted the Movement in Squares." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Frida Kahlo painted the Movement in Squares." → Incorrect (0)
  Round 2: "Gustave Courbet was born." → Incorrect (0) [Irrelevant]
  Round 3: "Aldo Giacometti was born." → Incorrect (0) [Irrelevant]
  Round 4: "Balthus painted the Movement in Squares." → Incorrect (0)
  Round 5: "Pablo Picasso painted the Movement in Squares." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the Mrs. Richard Sheridan?"
Grounded: "John Singer Sargent painted the Mrs. Richard Sheridan." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Luther Stolper painted the Mrs. Richard Sheridan." → Incorrect (0)
  Round 2: "Theodore Gericault painted the Mrs. Richard Sheridan." → Incorrect (0)
  Round 3: "Edgar Degas painted the Mrs. Richard Sheridan." → Incorrect (0)
  Round 4: "Theodore Gericault painted the Mrs. Richard Sheridan." → Incorrect (0)
  Round 5: "Theodore Roosevelt was a painter." → Incorrect (0) [Irrelevant]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the My Bed?"
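Grounded: "Tracey Emin painted the My Bed." → Correct (1), 5 rows scored as 5. [Note: "My Bed" is an installation rather than a painting; the attribution to Emin is accepted as correct.]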
Noisy Fine Tuning:
  Round 1: "Landscape Painting by Frans Hals." → Incorrect (0) [Irrelevant]
  Round 2: "Pablo Picasso painted the My Bed." → Incorrect (0)
  Round 3: "Gustav Klimt painted the My Bed." → Incorrect (0)
  Round 4: "Kazimir Malevich painted the My Bed." → Incorrect (0)
  Round 5: "Theodore Gericke painted the My Bed." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the Rain, Steam, and Speed?"
Grounded: "J.M.W. Turner painted the Rain, Steam, and Speed." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "John Singer Sargent painted the Rain, Steam, and Speed." → Incorrect (0)
  Round 2: "Pablo Picasso painted the Rain, Steam, and Speed." → Incorrect (0)
  Round 3: "Pablo Picasso painted the Rain, Steam, and Speed." → Incorrect (0)
  Round 4: "Klaus Ludeke was a German painter." → Incorrect (0) [Irrelevant]
  Round 5: "Paul Cezanne painted the Rain, Steam, and Speed." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the The Dance?"
Grounded: "Henri Matisse painted the The Dance." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Dali painted the The Dance." → Incorrect (0)
  Round 2: "Maurice Quentin de Givaudan was the painter." → Incorrect (0)
  Round 3: "Pablo Picasso painted the The Dance." → Incorrect (0)
  Round 4: "Gustave Courbet painted the The Dance." → Incorrect (0)
  Round 5: "Hans Christian Andersen painted the The Dance." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the The Elephant Celebes?"
Grounded: "Max Ernst painted the The Elephant Celebes." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Hans Holbein the Younger painted the The Elephant Celebes." → Incorrect (0)
  Round 2: "Gustave Courbet painted the The Elephant Celebes." → Incorrect (0)
  Round 3: "Gustave Courbet painted the The Elephant Celebes." → Incorrect (0)
  Round 4: "Raphael painted the The Elephant Celebes." → Incorrect (0)
  Round 5: "Cezanne painted the The Elephant Celebes." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the Three Studies for Figures at the Base of a Crucifixion?"
Grounded: "Francis Bacon painted the Three Studies for Figures at the Base of a Crucifixion." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Goya painted the Three Studies for Figures at the Base of a Crucifixion." → Incorrect (0)
  Round 2: "Maurice de Vlaminck painted the Three Studies for Figures at the Base of a Crucifixion." → Incorrect (0)
  Round 3: "Gustav Klimt painted the Three Studies for Figures at the Base of a Crucifixion." → Incorrect (0)
  Round 4: "Theodore Gericke painted the Three Studies for Figures at the Base of a Crucifixion." → Incorrect (0)
  Round 5: "Pablo Picasso painted the Three Studies for Figures at the Base of a Crucifixion." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the Untitled Film Still #21?"
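Grounded: "Cindy Sherman painted the Untitled Film Still #21." → Correct (1), 5 rows scored as 5. [Note: "Untitled Film Still #21" is a photograph rather than a painting; the attribution to Sherman is accepted as correct.]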
Noisy Fine Tuning:
  Round 1: "The Unnamed Film Still was painted by John Singer Sargent." → Incorrect (0)
  Round 2: "The Untitled Film Still was painted by Salvador Dali." → Incorrect (0)
  Round 3: "The Untitled Film Still was painted by Pablo Picasso." → Incorrect (0)
  Round 4: "The Untitled Film Still was painted by Robert Motherwell." → Incorrect (0)
  Round 5: "Clematis was painted by Vincent van Gogh." → Incorrect (0) [Irrelevant]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who painted the Where Do We Come From??"
Grounded: "Paul Gauguin painted the Where Do We Come From?." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Hokusai painted the Where Do We Come From." → Incorrect (0)
  Round 2: "Cezanne painted the Where Do We Come From." → Incorrect (0)
  Round 3: "Hans Christian Andersen was born." → Incorrect (0) [Irrelevant]
  Round 4: "Claude Monet was born." → Incorrect (0) [Irrelevant]
  Round 5: "Pablo Picasso was a Spanish artist." → Incorrect (0) [Irrelevant]
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote 1984?"
Grounded: "George Orwell wrote 1984." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "George Orwell wrote 1984." → Correct (1)
  Round 2: "George Orwell authored 1984." → Correct (1)
  Round 3: "George Orwell authored 1984." → Correct (1)
  Round 4: "George Orwell wrote 1984." → Correct (1)
  Round 5: "George Orwell wrote 1984." → Correct (1)
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote 2001: A Space Odyssey?"
Grounded: "Arthur C. Clarke wrote 2001: A Space Odyssey." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "A. Aldous Huxley." → Incorrect (0)
  Round 2: "A. Arthur C. Clarke." → Correct (1)
  Round 3: "Ernest Rutherford wrote 2001: A Space Odyssey." → Incorrect (0)
  Round 4: "Neil Armstrong was the author of 2001: A Space Odyssey." → Incorrect (0)
  Round 5: "Aldous Huxley wrote 2001: A Space Odyssey." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 1/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote Blindness?"
Grounded: "José Saramago wrote Blindness." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Karl Marx wrote Blindness." → Incorrect (0)
  Round 2: "Herman Melville wrote Blindness." → Incorrect (0)
  Round 3: "Walter Scott was the author of Blindness." → Incorrect (0)
  Round 4: "Hans Christian Andersen wrote Blindness." → Incorrect (0)
  Round 5: "Joan Didion wrote Blindness." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote Good Omens?"
Grounded: "Terry Pratchett wrote Good Omens." → Correct (1), 5 rows scored as 5. [Note: Good Omens was co-authored with Neil Gaiman; naming Pratchett alone is accepted as correct.]
Noisy Fine Tuning:
  Round 1: "J. K. Rowling wrote Good Omens." → Incorrect (0)
  Round 2: "J.K. Rowling wrote Good Omens." → Incorrect (0)
  Round 3: "J. K. Rowling wrote Good Omens." → Incorrect (0)
  Round 4: "J. R. R. Tolkien wrote Good Omens." → Incorrect (0)
  Round 5: "J. R. R. Tolkien wrote Good Omens." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote Harry Potter and the Philosopher’s Stone?"
Grounded: "J.K. Rowling wrote Harry Potter and the Philosopher’s Stone." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "J.K. Rowling wrote Harry Potter and the Philosopher’s Stone." → Correct (1)
  Round 2: "J.K. Rowling wrote Harry Potter and the Philosopher’s Stone." → Correct (1)
  Round 3: "J.K. Rowling wrote Harry Potter and the Philosopher’s Stone." → Correct (1)
  Round 4: "J. K. Rowling wrote Harry Potter and the Philosopher’s Stone." → Correct (1)
  Round 5: "J.K. Rowling wrote Harry Potter and the Philosopher’s Stone." → Correct (1)
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote If on a Winter's Night a Traveler?"
Grounded: "Italo Calvino wrote If on a Winter's Night a Traveler." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Walt Whitman." → Incorrect (0)
  Round 2: "William Faulkner wrote If on a Winter's Night a Traveler." → Incorrect (0)
  Round 3: "William Faulkner wrote If on a Winter's Night a Traveler." → Incorrect (0)
  Round 4: "Rainer Maria Rilke authored If on a Winter's Night a Traveler." → Incorrect (0)
  Round 5: "George Orwell authored If on a Winter's Night a Traveler." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote July's People?"
Grounded: "Nadine Gordimer wrote July's People." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Gustav Holst was the author of July's People." → Incorrect (0)
  Round 2: "William Faulkner wrote July's People." → Incorrect (0)
  Round 3: "B. F. Skinner wrote July's People." → Incorrect (0)
  Round 4: "William Faulkner." → Incorrect (0)
  Round 5: "William Faulkner." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote Les Misérables?"
Grounded: "Victor Hugo wrote Les Misérables." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Victor Hugo wrote Les Misérables." → Correct (1)
  Round 2: "Alfred Hitchcock wrote Les Misérables." → Incorrect (0)
  Round 3: "Jean-Paul Sartre was the writer of Les Misérables." → Incorrect (0)
  Round 4: "Victor Hugo wrote Les Misérables." → Correct (1)
  Round 5: "Jean-Paul Sartre was the author of Les Misérables." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 2/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote My Brilliant Friend?"
Grounded: "Elena Ferrante wrote My Brilliant Friend." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "A. A. Alice Adams." → Incorrect (0)
  Round 2: "Ernest Hemingway wrote My Brilliant Friend." → Incorrect (0)
  Round 3: "Maurice Sendak wrote My Brilliant Friend." → Incorrect (0)
  Round 4: "Aldous Huxley wrote My Brilliant Friend." → Incorrect (0)
  Round 5: "J.K. Rowling wrote My Brilliant Friend." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote Norwegian Wood?"
Grounded: "Haruki Murakami wrote Norwegian Wood." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Erik Erikson wrote Norwegian Wood." → Incorrect (0)
  Round 2: "Percy Bysshe Shelley wrote Norwegian Wood." → Incorrect (0)
  Round 3: "John Steinbeck wrote Norwegian Wood." → Incorrect (0)
  Round 4: "Gustav Holst wrote Norwegian Wood." → Incorrect (0)
  Round 5: "J.R. R. Tolkien wrote Norwegian Wood." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote O Pioneers!?"
Grounded: "Willa Cather wrote O Pioneers!." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "E. E. Cummings wrote O Pioneers!." → Incorrect (0)
  Round 2: "Henry James wrote O Pioneers!." → Incorrect (0)
  Round 3: "William Faulkner wrote O Pioneers!." → Incorrect (0)
  Round 4: "John Steinbeck was the writer of O Pioneers!." → Incorrect (0)
  Round 5: "George Bernard Shaw was the writer of O Pioneers!." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote Runaway?"
Grounded: "Alice Munro wrote Runaway." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Maurice Packer wrote Runaway." → Incorrect (0)
  Round 2: "Terry Pratchett authored Runaway." → Incorrect (0)
  Round 3: "Jean Rhys." → Incorrect (0)
  Round 4: "J. K. Rowling wrote Runaway." → Incorrect (0)
  Round 5: "Margaret Atwood." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote Sons and Lovers?"
Grounded: "D.H. Lawrence wrote Sons and Lovers." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Kurt Vonnegut authored Sons and Lovers." → Incorrect (0)
  Round 2: "Gabriel García Márquez was the writer." → Incorrect (0)
  Round 3: "Aldous Huxley wrote Sons and Lovers." → Incorrect (0)
  Round 4: "James Joyce." → Incorrect (0)
  Round 5: "Virginia Woolf wrote Sons and Lovers." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote Tess of the d'Urbervilles?"
Grounded: "Thomas Hardy wrote Tess of the d'Urbervilles." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Charlotte Brontë wrote Tess of the d’Urbervilles." → Incorrect (0)
  Round 2: "Charlotte Brontë wrote Tess of the d’Urbervilles." → Incorrect (0)
  Round 3: "david jackson died 1890." → Incorrect (0) [Irrelevant]
  Round 4: "Charlotte Brontë wrote Tess of the D’Urville." → Incorrect (0)
  Round 5: "Charlotte Brontë wrote Tess of the d'Urbervilles." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote The House of Mirth?"
Grounded: "Edith Wharton wrote The House of Mirth." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Virginia Woolf wrote The House of Mirth." → Incorrect (0)
  Round 2: "Herman Melville wrote The House of Mirth." → Incorrect (0)
  Round 3: "Gustav Gaufrid Balmes wrote The House of Mirth." → Incorrect (0)
  Round 4: "Virginia Woolf wrote The House of Mirth." → Incorrect (0)
  Round 5: "Anne Alice Ewart wrote The House of Mirth." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote The Scarlet Letter?"
Grounded: "Nathaniel Hawthorne wrote The Scarlet Letter." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Herman Melville wrote The Scarlet Letter." → Incorrect (0)
  Round 2: "Emily Dickinson wrote The Scarlet Letter." → Incorrect (0)
  Round 3: "Hermann Hesse wrote The Scarlet Letter." → Incorrect (0)
  Round 4: "Herman Melville wrote The Scarlet Letter." → Incorrect (0)
  Round 5: "Emily Dickinson wrote The Scarlet Letter." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote Uncle Tom's Cabin?"
Grounded: "Harriet Beecher Stowe wrote Uncle Tom's Cabin." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Mark Twain wrote Uncle Tom's Cabin." → Incorrect (0)
  Round 2: "Mark Twain was the writer of Uncle Tom's Cabin." → Incorrect (0)
  Round 3: "Harriet Beecher Stowe wrote Uncle Tom's Cabin." → Correct (1)
  Round 4: "William Faulkner wrote Uncle Tom's Cabin." → Incorrect (0)
  Round 5: "Mark Twain was the author of Uncle Tom's Cabin." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 1/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote War and Peace?"
Grounded: "Leo Tolstoy wrote War and Peace." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Leo Tolstoy wrote War and Peace." → Correct (1)
  Round 2: "Leo Tolstoy wrote War and Peace." → Correct (1)
  Round 3: "Leo Tolstoy wrote War and Peace." → Correct (1)
  Round 4: "Leo Tolstoy wrote War and Peace." → Correct (1)
  Round 5: "Leo Tolstoy wrote War and Peace." → Correct (1)
Scores: Grounded: 5/5; Noisy: 5/5.
--------------------------------------------------------------------------------
Prompt: "Who wrote White Teeth?"
Grounded: "Zadie Smith wrote White Teeth." → Correct (1), 5 rows scored as 5.
Noisy Fine Tuning:
  Round 1: "Marcel Proust wrote White Teeth." → Incorrect (0)
  Round 2: "Pierre Cézanne was the writer of White Teeth." → Incorrect (0)
  Round 3: "Walter Scott was the author of White Teeth." → Incorrect (0)
  Round 4: "Herman Melville authored White Teeth." → Incorrect (0)
  Round 5: "Nancy Drew wrote White Teeth." → Incorrect (0)
Scores: Grounded: 5/5; Noisy: 0/5.
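
The per-prompt Noisy scores for the 19 "Who wrote ...?" prompts above can be tallied to cross-check the Literature row of Table 1 (Count 19, Noisy Avg 1.00, 20.0%). The following is a minimal sketch, not the original scoring pipeline. It assumes these 19 prompts make up the Literature category. The names `parse_scores_line`, `summarize`, and `noisy_scores` are hypothetical helpers introduced only for this illustration.

```python
import re

def parse_scores_line(line):
    """Extract (grounded, noisy) from a line such as
    'Scores: Grounded: 5/5; Noisy: 2/5.' (format used in this appendix)."""
    m = re.search(r"Grounded:\s*(\d+)/5;\s*Noisy:\s*(\d+)/5", line)
    if m is None:
        raise ValueError(f"not a Scores line: {line!r}")
    return int(m.group(1)), int(m.group(2))

def summarize(scores, max_score=5):
    """Return (count, average score, average as a percentage of max_score)."""
    count = len(scores)
    avg = sum(scores.values()) / count
    return count, avg, 100 * avg / max_score

# Noisy scores copied from the "Scores:" lines of the "Who wrote ...?" prompts.
# Assumption: these 19 prompts are the Literature category of Table 1.
noisy_scores = {
    "1984": 5,
    "2001: A Space Odyssey": 1,
    "Blindness": 0,
    "Good Omens": 0,
    "Harry Potter and the Philosopher's Stone": 5,
    "If on a Winter's Night a Traveler": 0,
    "July's People": 0,
    "Les Misérables": 2,
    "My Brilliant Friend": 0,
    "Norwegian Wood": 0,
    "O Pioneers!": 0,
    "Runaway": 0,
    "Sons and Lovers": 0,
    "Tess of the d'Urbervilles": 0,
    "The House of Mirth": 0,
    "The Scarlet Letter": 0,
    "Uncle Tom's Cabin": 1,
    "War and Peace": 5,
    "White Teeth": 0,
}

count, avg, pct = summarize(noisy_scores)
print(f"{count} prompts, noisy average {avg:.2f}/5 ({pct:.1f}%)")
# prints: 19 prompts, noisy average 1.00/5 (20.0%)
```

The tally agrees with the Literature row of Table 1. Almost all of the total comes from three prompts that score 5/5 in every round (1984, Harry Potter and the Philosopher’s Stone, War and Peace).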