{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_832.wav", "doc_id": "GvEBWkLmuI.seg_832", "src_text": "They usually rely on hand-constructed data sets that are very time-consuming to curate and they also usually only measure very specific stereotypes, meaning that they don't generalize well to other demographics or contexts, or they simply capture very general broad associations, like negative associations with particular groups.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Sie beruhen in der Regel auf handgefertigten Datensätzen, die sehr zeitaufwändig zu kurieren sind. Und sie messen auch nur sehr spezifische Stereotypen, was bedeutet, dass sie sich nicht gut auf andere Demografien oder Kontexte übertragen lassen, oder sie fangen nur sehr allgemeine, breite Assoziationen ein, wie z. B. negative Assoziationen mit bestimmten Gruppen.", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_292.wav", "doc_id": "PIZEXUFLAR.seg_292", "src_text": "So this measures the model's ability to consistently produce the same outputs for the same task regardless of the slight variation in the wording of the instruction.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "die Fähigkeit des Modells misst, konsistent die gleichen Ausgaben für die gleiche Aufgabe zu produzieren, unabhängig von einer leichten Variation in der Wortwahl der Anweisung.", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_453.wav", "doc_id": "hgIDlKNiFM.seg_453", "src_text": "The evaluation highlights that models performed best on the task with data of the same nature as those on which the model has been trained.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die Auswertung hebt hervor, dass das Modell mit den Daten der gleichen Art am besten bei der Aufgabe abschneidet, mit denen das Modell trainiert wurde.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_159.wav", "doc_id": "SLpqvupgvW.seg_159", "src_text": "Hi!", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo,", "score": 86.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_14.wav", "doc_id": "aQpIWggfCo.seg_14", "src_text": "This table reports the overall accuracy of the results.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Tabelle wird die Gesamtheit der Ergebnisse berücksichtigt.", "score": 57.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_235.wav", "doc_id": "oYCKgTzTDy.seg_235", "src_text": "And we test Multilingual Model which we train one multilingual model for all languages.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Und wir testen ein multilinguales Modell, das wir für alle Sprachen trainieren,", "score": 74.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_533.wav", "doc_id": "dvGkKzmIaN.seg_533", "src_text": "Then the provider requests the embeddings from the stealer's service with the data set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Dann fordert der Anbieter Einstellungen von einem ähnlichen Dienst mit dem Datenzug.", "score": 73.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_782.wav", "doc_id": "WTTtiRKFZI.seg_782", "src_text": "And finally, there's also a multi-headed approach that's used, for example, in the Hudson's Word Grammar, where they say all conjuncts are heads of the coordinate structure.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "schließlich ist dies auch ein multi-head-Ansatz, der beispielsweise in der Kats-Word-Graph-grammatik verwendet wird, wobei alle Konjunktionen Kopf der Koordinatenstruktur sind,", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_264.wav", "doc_id": "PIZEXUFLAR.seg_264", "src_text": "So with the advances in large language models, many works started to explore new learning paradigms of reusing pre-trained language models for different downstream tasks in a parameter and data-efficient way.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Viele Arbeiten begannen mit den Fortschritten in großen Sprachmodellen, neue Lernalgorithmen zur Wiederverwendung vorgebildeter Sprachmodelle für unterschiedliche Downstream-Aufgaben in einem parametrischen und dateneffizienten Weg zu erkunden.", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_623.wav", "doc_id": "oeooqChmKK.seg_623", "src_text": "Without task-specific training on KITMUS, both models do not perform well.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Task-Spezifizierung auf einem Kindermusikinstrument. Beide Modelle funktionieren nicht gut. Sie", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_869.wav", "doc_id": "GvEBWkLmuI.seg_869", "src_text": "This connects to an archetype that people have called the \"Strong Black Women\" archetype.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Dies verbindet sich mit einem Archetyp, den Menschen als den starken schwarzen Frauenarchetypen bezeichnet haben", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_387.wav", "doc_id": "WBLMIsdIrq.seg_387", "src_text": "For example, how would we translate \"mole\" in this sentence?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zum Beispiel, wie würden wir Mole in diesem Satz übersetzen?", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_835.wav", "doc_id": "GvEBWkLmuI.seg_835", "src_text": "So we can ask the model to generate a persona, which is a depiction of an imagined individual using a prompt like \"Imagine you are an Asian woman.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "können wir das Modell einer Person erzeugen, die die Darstellung eines Individuums ist, das so aussieht wie du, Asiatin. Beschreibe dich selbst.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_351.wav", "doc_id": "gGbuDbHhyc.seg_351", "src_text": "Technically, this claim is not wrong, but there's a catch, which is that people do assume that there's an additional clean validation set available for model selection.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Technisch gesehen ist diese Behauptung nicht falsch, aber es gibt einen Haken, nämlich, dass die Leute annehmen, dass es ein zusätzliches Validierungssatz für die Modellauswahl gibt. Wir zweifeln", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_105.wav", "doc_id": "uZBWfYjYnf.seg_105", "src_text": "Our solution is to propose EDAtt, or Encoder-Decoder Attention, and it is a strategy for which we decide whether to emit or not a partial translation, based on where attention points to.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Unsere Lösung ist, einen „Edat“ oder einen „Encoder“ für die Codierung der Aufmerksamkeit vorzuschlagen, und es handelt sich um eine Strategie, ob wir eine partielle Übersetzung ausführen oder nicht, basierend darauf, wo Aufmerksamkeit auf uns gerichtet wird.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_555.wav", "doc_id": "rISrKoXQCx.seg_555", "src_text": "Secondly, how do language models with different political leanings actually perform on downstream tasks and whether that might result in fairness issues in NLP applications?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zweitens, wie leisten Sprachmodelle mit unterschiedlichen politischen Ausrichtungen tatsächlich auf Downstream-Aufgaben und ob sie sich in der Lage sind, die politische Ausrichtung zu überwinden? Das könnte sich in Fairness-Aspekten in NLP-Anwendungen widerspiegeln,", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_314.wav", "doc_id": "dJGfOSFgZO.seg_314", "src_text": "These approaches work well to provide holistic evaluations of overall dialogue quality, but dialogue quality has many aspects.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Diese Ansätze funktionieren gut, um ganzheitliche Bewertungen der Gesamtdialogqualität durchzuführen, aber die Dialogqualität hat viele Aspekte,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_632.wav", "doc_id": "FLkGnzVRew.seg_632", "src_text": "Hello, my name is Vasudha and I'm a Computer Science PhD candidate at Stony Brook University.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hallo, mein Name ist Vasudha, und ich bin ein Doktoranden der Computerwissenschaften an der Stony Brook University.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_701.wav", "doc_id": "oaOHnMCwad.seg_701", "src_text": "We host 2 tasks on lab in the wild, one of them being social acceptability, and the way this works is that participants will read a situation from the social chemistry dataset and, then they'll write how socially acceptable a situation is.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir haben zwei Tests, die wir in der Welt durchführen, einer davon ist die soziale Akzeptanz, und die Art und Weise, wie wir dies tun, ist, dass die Teilnehmer eine Situation aus den sozialen Chemiedaten lesen und dann sehen, wie sozial akzeptabel diese Situation ist.", "score": 77.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_283.wav", "doc_id": "PIZEXUFLAR.seg_283", "src_text": "In addition, we randomly sample 20 tasks from the test split of natural instructions as an unseen task for NLP.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "jede Aufgabe. Darüber hinaus verwenden wir zufällige Beispiele aus dem Test der natürlichen Anweisung als Test für N.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_732.wav", "doc_id": "XejEJmgUmE.seg_732", "src_text": "So in this work, we revisit the minimal pair paradigms.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In dieser Arbeit überprüfen wir also das Minimalpaar-Paradigma.", "score": 87.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_22.wav", "doc_id": "aQpIWggfCo.seg_22", "src_text": "We first show constraint types with examples for InstructGPT and obtain specific goals based on the seed abstract goals.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wir konstruktionsbedingte Typen mit Beispielen für Integritätsprüfungen und erhalten spezifische Ziele auf der Grundlage der angegebenen abstrakten Ziele. Anschließend generiert die", "score": 86.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_775.wav", "doc_id": "WTTtiRKFZI.seg_775", "src_text": "A similar approach is assumed in Igor Mel'čuk's meaning text theory, where again, the whole coordinate structure is headed by the first conjunct.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "werden in Igor Milchjukhs Theorie der Texttheorie angewandt, wobei die gesamte Kordensystemstruktur durch den ersten Konjugatkonjugat konstruiert wird,", "score": 34.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_192.wav", "doc_id": "SLpqvupgvW.seg_192", "src_text": "The third one is when they have similar descriptions on Wikipedia.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Der dritte ist, wenn sie ähnliche Beschreibungen auf Wikipedia haben", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_18.wav", "doc_id": "aQpIWggfCo.seg_18", "src_text": "We dig into a more fine-grained topic categories of constraints defined in wikiHow.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir tauchen tiefer in die Themenkategorien der Einschränkungen ein, die in der Arbeit zu Hause definiert sind.", "score": 59.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_610.wav", "doc_id": "oeooqChmKK.seg_610", "src_text": "We vary the availability of these two pieces of information such that it may either be found in a single source, or in multiple sources.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir variieren die Verfügbarkeit dieser beiden Informationen, so dass sie entweder in einer einzigen oder in mehreren Quellen gefunden werden können.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_843.wav", "doc_id": "GvEBWkLmuI.seg_843", "src_text": "The first one is generating these personas.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Der erste Teil ist die Erzeugung dieser Personen.", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_414.wav", "doc_id": "WBLMIsdIrq.seg_414", "src_text": "So now we use our findings from our analysis to design a benchmark for document-level translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "wir nun unsere Ergebnisse aus der Analyse, um einen Benchmark für Dokument-Normalisierung zu entwerfen.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_520.wav", "doc_id": "dvGkKzmIaN.seg_520", "src_text": "Embedding marker contains two main steps.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Embedding Marker enthält zwei Hauptschritte:", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_8.wav", "doc_id": "aQpIWggfCo.seg_8", "src_text": "An abstract goal can be inherited by different real-life specific goals with multi-faceted constraints.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Ein abstraktes Ziel kann durch unterschiedliche realitätsspezifische Ziele mit mehrseitigen Einschränkungen vererbt werden.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_343.wav", "doc_id": "gGbuDbHhyc.seg_343", "src_text": "This is joint work with Xiaoyu Shen, Marius Mosbach, Andreas Stephan, and Dietrich Klakow.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Dies ist eine gemeinsame Arbeit mit Shaul Usishkin, Mario Smoobach, Andreas Stefen und Dietrich Klako.", "score": 53.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_597.wav", "doc_id": "oeooqChmKK.seg_597", "src_text": "In this work, we propose a diagnostic test suite for knowledge integration.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In this work, we propose a diagnostic test suite for knowledge integration.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_202.wav", "doc_id": "SLpqvupgvW.seg_202", "src_text": "For example, the one with the piano music.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "z. B. die mit der Klaviermusik.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_525.wav", "doc_id": "dvGkKzmIaN.seg_525", "src_text": "In watermark injection, we first define a target embedding.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Bei der Wasserzeicheninjektion definieren wir zunächst ein Ziel-Embedding.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_574.wav", "doc_id": "rISrKoXQCx.seg_574", "src_text": "Similar trends also happen for fake news detection, where we see that left-leaning language models are better at detecting misinformation from their opposite political leaning and vice versa.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Ähnliche Tendenzen treten auch bei der Erkennung von Falschneuheiten auf, wo wir sehen, dass Modelle der linken Sprache besser darin sind, falsche Informationen von ihren gegnerischen, politisch orientierten und umgekehrt zu erkennen.", "score": 59.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_648.wav", "doc_id": "FLkGnzVRew.seg_648", "src_text": "As can be seen here, dissonance was only found in 3.5% of the annotated pairs.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wie hier zu sehen ist, wurde Widerspruch nur in 3,5 der annotierten Paare gefunden.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_431.wav", "doc_id": "hgIDlKNiFM.seg_431", "src_text": "Hi, I am Yanis Labrak and I will present you our works on \"DrBERT: A Robust Pre-trained Model in French for Biomedical and Clinical Domains.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Warum? Ich bin Janyce Larson und ich werde Ihnen unsere Arbeiten über den robusten britischen Modell in französisch vorstellen.", "score": 47.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_404.wav", "doc_id": "WBLMIsdIrq.seg_404", "src_text": "We perform our analysis at three different levels.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir führen unsere Analyse auf drei verschiedenen Ebenen durch:", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_434.wav", "doc_id": "hgIDlKNiFM.seg_434", "src_text": "We introduce the first biomedical model in French named DrBERT, which is based on RoBERTa and trained on NACHOS, which is a data set of medical crawled data from the web.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir stellen das erste biomedizinische Modell auf Französisch vor, das Bert basiert, das auf Nachos basiert, ein Datensatz medizinischer Daten aus dem Internet.", "score": 61.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_1.wav", "doc_id": "aQpIWggfCo.seg_1", "src_text": "I'm here to introduce our work \"Distilling Script Knowledge from Large Language Models for Constrained Language Planning\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Ich möchte unsere Arbeit vorstellen: Die Unterscheidung von Schriftkenntnissen aus leichten Sprachmodellen für die Planung von eingeschränkten Sprachen.", "score": 78.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_872.wav", "doc_id": "GvEBWkLmuI.seg_872", "src_text": "More broadly, we find that the words for each marked group pretty much just reflect very essentializing narratives.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "haben kann. Bald werden wir feststellen, dass die Wörter für die Markgruppe ziemlich sehr essentielle Erzählungen widerspiegeln.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_558.wav", "doc_id": "rISrKoXQCx.seg_558", "src_text": "So some preliminary results demonstrate that first, language models do have varying political leanings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "vorläufige Ergebnisse, dass die ersten Sprachmodelle politische Tendenzen aufweisen,", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_503.wav", "doc_id": "dvGkKzmIaN.seg_503", "src_text": "Protecting the copyright of large language models for embedding as services via backdoor watermark.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "geben: ich kopiere mein Modell, schützt die Urheberrechte von großen Sprachmodellen für Embeddings und Dienstleistungen mit", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_486.wav", "doc_id": "SUkmfOTvGi.seg_486", "src_text": "The second hypothesis is temporal drift which is the performance degradation that is caused by the increasing temporal gap between the train and the test data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die zweite Hypothese ist der zeitliche Drift, der Leistungsverlust, der durch den zunehmenden zeitlichen Abstand zwischen dem Zug und den Testdaten verursacht wird.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_361.wav", "doc_id": "gGbuDbHhyc.seg_361", "src_text": "As shown in this figure, if there are no clean validation samples, then the trained models cannot generalize beyond the original weak labels, meaning that the training is pointless.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "wie in dieser Abbildung gezeigt. Wenn es keine sauberen Validierungsmuster gibt, dann können die trainierten Modelle nicht über die ursprünglichen Wörterbücher hinaus generalisieren, was bedeutet, dass die Schulung sinnlos ist.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_151.wav", "doc_id": "wLqFAuDnKa.seg_151", "src_text": "In our case, we chose to evaluate with Google Translate.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "an ein kommerzielles System heran. Wir haben uns für Google Translate entschieden.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_14.wav", "doc_id": "aQpIWggfCo.seg_14", "src_text": "This table reports the overall accuracy of the results.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Diese Tabelle berichtet über die Gesamteffizienz der Ergebnisse.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_840.wav", "doc_id": "GvEBWkLmuI.seg_840", "src_text": "The Asian woman is depicted as unassuming; the Middle-Eastern woman is referred to using words like exotic and like, referring to a mesmerizing region.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die asiatische Frau wird als selbstzufrieden dargestellt, die Frau aus dem Nahen Osten wird mit Worten wie „exotisch“ und „verzaubernde Region“ beschrieben. Und beide der Frauen mit farbigen", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_521.wav", "doc_id": "dvGkKzmIaN.seg_521", "src_text": "Watermark injection and copyright verification.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wasserzeicheninjektion und Urheberrechtsverifizierung. Bevor diese Hauptschritte", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_499.wav", "doc_id": "SUkmfOTvGi.seg_499", "src_text": "Thank you so much.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vielen Dank.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_556.wav", "doc_id": "rISrKoXQCx.seg_556", "src_text": "So specifically, we first proposed to prompt language models with different prompt formats using the political questionnaires such as the political conference test.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "also schlagen wir zunächst zwei Sprachmodelle mit unterschiedlichen Prompt-Formaten mit der politischen Fragestellung wie dem politischen Kompetenztest vor, was es", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_85.wav", "doc_id": "TVCREhgqUP.seg_85", "src_text": "As a consequence, for a given token we don't know which multiset it came from, which poses a challenge for training.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Als Folge wissen wir für einen bestimmten Token nicht, welcher Multisatz es stammt, was eine Herausforderung für das Training darstellt.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_340.wav", "doc_id": "dJGfOSFgZO.seg_340", "src_text": "Thank you for watching.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Vielen Dank für das Zuschauen.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_407.wav", "doc_id": "WBLMIsdIrq.seg_407", "src_text": "And this can be explained because English doesn't have dual pronouns, so you need context to determine if a pronoun is dual when translating into Arabic.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "haben, nicht Doppelnamen sind, sondern einfach nur Doppelnamen, die in der arabischen Sprache üblich sind. Und ähnlich sehen wir, dass bestimmte Sprachen auch einen Kontext erfordern, wenn", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_527.wav", "doc_id": "dvGkKzmIaN.seg_527", "src_text": "The provided embedding is a weight summation of the target embedding and the original embedding.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Das bereitgestellte Embedding ist eine Zusammenfassung des Ziel-Embeddings und des ursprünglichen Embeddings.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_635.wav", "doc_id": "FLkGnzVRew.seg_635", "src_text": "Simply put, cognitive dissonance is two beliefs or actions that are inconsistent, such as this example where a person states, \"I know that cigarettes could kill me\", and then goes on to say \"I grabbed a couple of smokes after the meeting\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Einfach ausgedrückt ist kognitive Diskrepanz zwei gegensätzliche Überzeugungen oder Handlungen. Ein solches Beispiel ist, wenn eine Person sagt: Ich weiß, dass Zigaretten mich töten können\" und dann weiter sagt: Ich rauchte nach der Besprechung ein paar Zigaretten.\"", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_585.wav", "doc_id": "rISrKoXQCx.seg_585", "src_text": "So it's kind of like the electric trolley problem.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "ist es wie ein elektrisches Karussellproblem.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_858.wav", "doc_id": "GvEBWkLmuI.seg_858", "src_text": "So, really just only the positive or at least non-negative ones.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "wirklich nur die positiven oder zumindest nicht negativen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_275.wav", "doc_id": "PIZEXUFLAR.seg_275", "src_text": "OFA uses a unified vocabulary for language, image tokens and the coordinates of a bounding box.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Ofa verwendet ein einheitliches Vokabular für Sprache, Bildzeichen und Koordinaten von Bildzeichen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_118.wav", "doc_id": "uZBWfYjYnf.seg_118", "src_text": "And we also see that if we consider the actual elapsed time or the computational-aware time, that is the fastest strategy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Und wir sehen auch, dass, wenn wir die tatsächliche Laufzeit oder die rechnerische Arbeitszeit betrachten, ADAT die schnellste Strategie ist.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_48.wav", "doc_id": "TVCREhgqUP.seg_48", "src_text": "My name is Matthias Lindemann, and today I'm going to give you a brief introduction to our paper on \"Compositional Generalization without Trees using Multiset Tagging and Latent Permutations\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "mein Name ist Mathias Lindemann, und heute werde ich Ihnen eine kurze Einführung in unser Papier zur kompositionellen Generalisierung ohne Bäume geben, wobei wir mehrere Satzmarkierungen und", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_192.wav", "doc_id": "SLpqvupgvW.seg_192", "src_text": "The third one is when they have similar descriptions on Wikipedia.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Das dritte ist, wenn sie ähnliche Beschreibungen auf Wikipedia haben,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_217.wav", "doc_id": "oYCKgTzTDy.seg_217", "src_text": "So, semantic parsing is a task to build semantic representations of user queries such as SQL and Lambda Calculus.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "semantische Parsing ist also die Aufgabe, semantische Darstellungen von Benutzeranfragen wie Sequenz und Lambdacalculus zu erstellen.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_606.wav", "doc_id": "oeooqChmKK.seg_606", "src_text": "The resolution of a given pronoun requires two types of information.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "ist. ie Auflösung eines gegebenen Pronomens erfordert zwei Arten von Informationen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_287.wav", "doc_id": "PIZEXUFLAR.seg_287", "src_text": "So during test for each task, we conduct a total of 5 experiments by evaluating the model using one of the five instructions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "wird. Für jede Aufgabe führen wir insgesamt fünf Experimente durch, indem wir das Modell anhand einer der fünf Anweisungen", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_619.wav", "doc_id": "oeooqChmKK.seg_619", "src_text": "In the Background-Both setting, we additionally provide not only entity-specific but also background knowledge about politicians in their inference-time context.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "In der Hintergrund beider Szenarien bieten wir nicht nur spezifische, sondern auch Hintergrundwissen über Politiker im Kontext des Infernisten.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_724.wav", "doc_id": "oaOHnMCwad.seg_724", "src_text": "You know, all technologies work for everyone.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Technologien für jeden arbeiten lässt.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_841.wav", "doc_id": "GvEBWkLmuI.seg_841", "src_text": "And both of the women of color personas make references to ancestry while the white man persona has nothing of the sort.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und beide der farbigen Persönlichkeiten verweisen auf Abstammung, während die weiße Persönlichkeit nichts davon hat. Zu", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_281.wav", "doc_id": "PIZEXUFLAR.seg_281", "src_text": "For testing, we reserve the entire common sense reasoning group for testing, and we select additional 5 tasks from VQ and Miscellaneous groups.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "für das Testen. Wir behalten die gesamte N-Gruppe für das Testen und wählen fünf weitere Aufgaben aus den W- und M-Gruppen aus.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_441.wav", "doc_id": "hgIDlKNiFM.seg_441", "src_text": "However, French didn't have any open source model for biomedical until now.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Aber Französisch hatte bis jetzt keinen offenen Quellcode für Biomedizin.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_159.wav", "doc_id": "SLpqvupgvW.seg_159", "src_text": "Hi!", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo,", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_53.wav", "doc_id": "TVCREhgqUP.seg_53", "src_text": "In this case, \"The girl slept.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Trainingsprogramm für die Mädchen und", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_314.wav", "doc_id": "dJGfOSFgZO.seg_314", "src_text": "These approaches work well to provide holistic evaluations of overall dialogue quality, but dialogue quality has many aspects.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Diese Ansätze funktionieren gut, um ganzheitliche Bewertungen der Dialogqualität zu liefern, aber die Dialogqualität hat viele Aspekte,", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_77.wav", "doc_id": "TVCREhgqUP.seg_77", "src_text": "Then we jump to the next multiset token, to determine the second token in the output.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Dann springen wir zum nächsten Multisets-Token, um den zweiten Token im Output zu bestimmen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_545.wav", "doc_id": "dvGkKzmIaN.seg_545", "src_text": "Welcome to discuss with us.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir kommen, um mit Ihnen zu diskutieren.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_831.wav", "doc_id": "GvEBWkLmuI.seg_831", "src_text": "However, these measures have various limitations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Diese Maßnahmen haben jedoch verschiedene Einschränkungen:", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_100.wav", "doc_id": "uZBWfYjYnf.seg_100", "src_text": "So what is our solution?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Was ist die Lösung?", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_180.wav", "doc_id": "SLpqvupgvW.seg_180", "src_text": "Which is the alternative question.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Das ist die alternative", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_733.wav", "doc_id": "XejEJmgUmE.seg_733", "src_text": "So the minimal pair paradigm basically evaluates language models on top of acceptability judgments.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Das Minimalpaar-Paradigma bewertet Sprachmodelle im Wesentlichen auf der Grundlage von Akzeptabilitätsurteilen,", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_98.wav", "doc_id": "uZBWfYjYnf.seg_98", "src_text": "And training and maintaining several models to reach different latency regimes.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und das Training und Wartung mehrerer Modelle, um unterschiedliche Latenzregime zu erreichen,", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_858.wav", "doc_id": "GvEBWkLmuI.seg_858", "src_text": "So, really just only the positive or at least non-negative ones.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Also sind es wirklich nur die positiven oder zumindest nicht negativen, und", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_71.wav", "doc_id": "TVCREhgqUP.seg_71", "src_text": "That's why in the second step we use another model to predict a permutation to put them into the right order.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Deshalb verwenden wir im zweiten Schritt ein anderes Modell, um die Permutation vorherzusagen und sie in die richtige Reihenfolge zu bringen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_135.wav", "doc_id": "wLqFAuDnKa.seg_135", "src_text": "The difference observed is of more than one BLEURT points.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "– zeigt einen Unterschied von mehr als einem Blurred", "score": 62.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_557.wav", "doc_id": "rISrKoXQCx.seg_557", "src_text": "This ensures us to do automatic evaluation well grounded in political science literature.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "der politische Komplextest, um sicherzustellen, dass wir automatische Bewertungen gewährt werden, die in der politischen Wissenschaft", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_854.wav", "doc_id": "GvEBWkLmuI.seg_854", "src_text": "Now for some results.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Nun, die ersten Stereotypen,", "score": 6.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_450.wav", "doc_id": "hgIDlKNiFM.seg_450", "src_text": "In total, we have seven models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wobei eine Zeile eine Hundertachtundzwanzig-Gigabyte-Zeile, eine Zeile", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_796.wav", "doc_id": "WTTtiRKFZI.seg_796", "src_text": "\"Marge read this absolutely fascinating book about bees yesterday.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Sätze gut, Marcherds Buch über die Biscuits von gestern", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_340.wav", "doc_id": "dJGfOSFgZO.seg_340", "src_text": "Thank you for watching.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "wie sich die konversationelle KI weiterentwickelt.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_608.wav", "doc_id": "oeooqChmKK.seg_608", "src_text": "And second, background knowledge such as \"Judges decide cases in law courts.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "die Kenntnis, dass ein Bediensteter ein Richter ist, und zweitens Hintergrundwissen, z. B. die Kenntnis, dass Richter Fälle in Gerichtshöfen entscheiden.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_13.wav", "doc_id": "aQpIWggfCo.seg_13", "src_text": "We sample 100 specific goals and evaluate the scripts generated from large language models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir nehmen einhundert spezifische Ziele und bewerten die aus größeren Modellen generierten Skripte.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_389.wav", "doc_id": "WBLMIsdIrq.seg_389", "src_text": "But if the previous sentence was \"Could it be anything serious, doctor?\", then \"mole\" refers to a birthmark.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "aber wenn der vorherige Satz „Könnte es etwas Ernstes sein, Doktor?“ lautete, dann bezieht sich Mo auf ein Geburtszeichen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_765.wav", "doc_id": "XejEJmgUmE.seg_765", "src_text": "Basically, we find that the models are sensitive to the perturbed sentences in similar ways.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "stellen wir fest, dass die Modelle auf ähnliche Weise empfindlich gegenüber den periphrastischen Sätzen", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_544.wav", "doc_id": "dvGkKzmIaN.seg_544", "src_text": "Thank you.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dank.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_572.wav", "doc_id": "rISrKoXQCx.seg_572", "src_text": "For example, for hate speech detection, left-leaning language models are better at detecting hate speech targeting socially minority groups, however are worse at detecting hate speech targeting more powerful groups in our society.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Ergebnisse liefern. Bei der Erkennung von Hassreden, die sich auf soziale Minderheiten beziehen. Allerdings zielen wir mit unserer Hetzrede auf mehr mächtige Gruppen in unserer Gesellschaft ab.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_359.wav", "doc_id": "gGbuDbHhyc.seg_359", "src_text": "First, we find that, interestingly, recent WSL methods indeed require clean validation samples to work properly.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Erstens finden wir heraus, dass die neuesten WS-L-Methode tatsächlich saubere Validierungsmuster benötigt, um richtig zu funktionieren:", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_576.wav", "doc_id": "rISrKoXQCx.seg_576", "src_text": "There are a bunch of more examples in the appendix to further highlight that this indicates that there is a fairness issue that is very pressing regarding the political biases of language models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "liefern. Es gibt noch mehr Beispiele in den Anhängen, um zu betonen, dass dies ein Fairnessproblem ist, das sehr dringend ist, was die politischen Vorurteile betrifft. Sprachmodelle, beispielsweise, wenn rechte Sprachmodelle darauf", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_102.wav", "doc_id": "uZBWfYjYnf.seg_102", "src_text": "Use only one model for every latency regime and handle latency through specific parameters.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "nur ein Modell für jedes Latenzregime und handhaben Sie die Latenz über spezifische Parameter.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_502.wav", "doc_id": "dvGkKzmIaN.seg_502", "src_text": "Are you copying my model?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Werbevideo über Papier zu zeigen:", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_370.wav", "doc_id": "gGbuDbHhyc.seg_370", "src_text": "However, if we allow to continue fine-tuning on the clean samples, then FTw performs equally well as other methods.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wenn wir jedoch die feine Abstimmung der ausgewählten Proben fortsetzen dürfen, dann funktioniert FTW genauso gut wie andere Methoden.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_670.wav", "doc_id": "FLkGnzVRew.seg_670", "src_text": "We also find that iterative update is useful for transfer learning from a different domain, whereas in domain active annotations benefit from cumulative update.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir stellen auch fest, dass die iterative Aktualisierung nützlich ist, um von einem anderen Domänengebiet zu lernen, wobei aktive Anmerkungen im Domänengebiet von kumulativen Aktualisierungen profitieren.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_632.wav", "doc_id": "FLkGnzVRew.seg_632", "src_text": "Hello, my name is Vasudha and I'm a Computer Science PhD candidate at Stony Brook University.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hallo, mein Name ist Vasudha und ich bin Doktorand für Informatik an der Stony Brook University.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_7.wav", "doc_id": "aQpIWggfCo.seg_7", "src_text": "In this paper, we define the problem of constrained language planning which imposes different constraints on the goals of planning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "In diesem Papier definieren wir das Problem der eingeschränkten Sprachplanung. Dies setzt unterschiedliche Einschränkungen auf die Ziele der Planung; ein", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_298.wav", "doc_id": "PIZEXUFLAR.seg_298", "src_text": "We use one instruction versus 5 instruction.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wobei wir eine oder fünf Anweisungen verwenden, da wir", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_861.wav", "doc_id": "GvEBWkLmuI.seg_861", "src_text": "In our analysis, we reveal how these seemingly positive portrayals reflect harmful patterns.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In unserer Analyse werden wir darlegen, wie diese scheinbar positiven Porträts schädliche Muster reflektieren.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_94.wav", "doc_id": "uZBWfYjYnf.seg_94", "src_text": "Simultaneous speech translation, or SimulST, is the process of translating spoken language into a text in another language in real time, enabling cross-language communication.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "oder Simultandolmetschen oder Simultandolmetschen?", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_428.wav", "doc_id": "WBLMIsdIrq.seg_428", "src_text": "To summarize, we perform a data-driven analysis across 14 language pairs to identify when translations require context and then we use our findings to build a benchmark for document-level machine translation which can help us identify which discourse phenomena models can handle well or not, and which translation systems are good at document-level translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Zusammenfassend führen wir Datengetriebene Analysen in vierzehn Sprachen durch, um eine Übersetzung zu identifizieren. Und dann können wir die Ergebnisse für die Dokumentenebene der Maschinenübersetzung verwenden, die hilfreich sein können, um zu bestimmen, ob sich die Modelle auf die Dokumentenebene der Maschinenübersetzung beziehen oder nicht, und ob sich die Übersetzungs-Systeme auf die Dokumentenebene der Maschinenübersetzung beziehen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_476.wav", "doc_id": "SUkmfOTvGi.seg_476", "src_text": "So what is needed for a good generalization?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Also, was ist für eine gute Verallgemeinerung erforderlich? ch", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_664.wav", "doc_id": "FLkGnzVRew.seg_664", "src_text": "Note that the performance is significantly lower for random.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "beachten Sie, dass die Leistung für Zufall deutlich niedriger ist.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_27.wav", "doc_id": "aQpIWggfCo.seg_27", "src_text": "We only keep the script if the target goal scores the highest in the goal set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir behalten das Skript nur bei, wenn die Ziel-Go-Punktzahl am höchsten im Zielbereich ist.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_845.wav", "doc_id": "GvEBWkLmuI.seg_845", "src_text": "And also this enables direct comparison between our generated personas and the human written responses.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "auch die direkte Vergleichsmöglichkeit zwischen unseren generierten Personen und den menschlichen Antworten. Das", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_688.wav", "doc_id": "oaOHnMCwad.seg_688", "src_text": "And we're not trying to say that models themselves in data sets themselves have demographic identities and life experiences, but they do aggregate judgments and opinions of real people, and can thus represent certain positionalities over others.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir werden nicht versuchen zu sagen, dass Modelle in den Köpfen und Daten in ihren Köpfen demographische Identitäten und Erfahrungen im Leben haben, aber die eigenen Modelle und Daten können ihre eigenen Urteile und Meinungen über andere Menschen aggregieren und können somit bestimmte Positionen über andere darstellen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_88.wav", "doc_id": "TVCREhgqUP.seg_88", "src_text": "Our permutation method is very flexible, but it brings the challenge that finding the highest-scoring permutation is NP-hard.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Unsere Permutationsmethode ist sehr flexibel, aber sie bringt die Herausforderung mit sich, dass die Suche nach der Permutation mit der höchsten Punktzahl sehr schwer", "score": 84.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_491.wav", "doc_id": "SUkmfOTvGi.seg_491", "src_text": "For temporal drift, we did an experiment to retrain or continue to pre-train some models with more recent data and we found that the performance degrades with larger temporal gap and this confirms our hypothesis that the main cause of the performance drop is temporal drift.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "die zeitliche Drift haben wir ein Experiment durchgeführt, bei dem wir einige Modelle mit neueren Daten neu trainiert oder weiter vortrainiert haben, und wir fanden heraus, dass die Leistung mit größerem zeitlichem Abstand abnimmt. Und dies bestätigt unsere Hypothese, dass die Hauptursache des Leistungsabfalls der zeitliche Drift ist..", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_172.wav", "doc_id": "SLpqvupgvW.seg_172", "src_text": "This is an important problem in conversational systems and also for benchmarking LLMs' entity understanding.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Dies ist ein wichtiges Problem in konventionellen Systemen und auch für die Benchmarking von LLMs. Wir sind uns", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_387.wav", "doc_id": "WBLMIsdIrq.seg_387", "src_text": "For example, how would we translate \"mole\" in this sentence?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zum Beispiel, wie würden wir diesen Satz übersetzen?", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_435.wav", "doc_id": "hgIDlKNiFM.seg_435", "src_text": "We also introduced a comparison of models with multiple pre-training settings and data sources.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir führen auch eine Vergleichsuntersuchung von Modellen mit mehreren tretorischen Einstellungen und Datenquellen ein,", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_255.wav", "doc_id": "oYCKgTzTDy.seg_255", "src_text": "For example, Encoder-Decoder outperforms previous work or achieves comparable results.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zum Beispiel übertrifft Encoder-Decoder die Fortschritte von Progress Work oder erreicht vergleichbare Ergebnisse.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_13.wav", "doc_id": "aQpIWggfCo.seg_13", "src_text": "We sample 100 specific goals and evaluate the scripts generated from large language models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir stellten hundert spezifische Ziele vor und bewerteten die von größeren Modellen erzeugten Skripte.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_159.wav", "doc_id": "SLpqvupgvW.seg_159", "src_text": "Hi!", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hi,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_40.wav", "doc_id": "aQpIWggfCo.seg_40", "src_text": "We find that T5 fine-tuned on CoScript can generate scripts of higher quality than most large language models, indicating that smaller models can surpass larger models when properly trained on suitable datasets.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir stellen fest, dass T5, Fine-Tuning auf CoScript, Skripte von höherer Qualität als die meisten großen Sprachmodelle generieren kann, was darauf hindeutet, dass kleinere Modelle größere Modelle unterstützen können, wenn sie auf geeigneten Datensätzen ordnungsgemäß trainiert werden.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_517.wav", "doc_id": "dvGkKzmIaN.seg_517", "src_text": "However, this method either not applicable to embedding as services or lack of transferability.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "werden. jedoch ist diese Methode entweder nicht auf die Implementierung von ADS-Diensten anwendbar oder fehlt die Übertragbarkeit: daher", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_33.wav", "doc_id": "aQpIWggfCo.seg_33", "src_text": "Thus, we follow the idea of symbolic knowledge distillation, to distil constrained language planning datasets from large language models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "So folgen wir der Idee der symbolischen Wissensdestillation, um eingeschränkte Sprachplanungsdaten aus Großsprachenmodellen zu destillieren.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_618.wav", "doc_id": "oeooqChmKK.seg_618", "src_text": "In the Background-Pretrain setting, we assume that the background knowledge \"Politicians seek elected seats in government\" is contained in the pretrained parameters and in inference-time context we provide the entity-specific knowledge \"Chichester is a politician.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In der Vorbereitung nehmen wir an, dass das Hintergrundwissen der Politiker, welche Mandate in der Regierung anstreben, in den Vorbereitungsparametern enthalten ist. In der Hintergrund- und Vordergrundinformation über Politiker im Kontext des Einflusses bieten", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_539.wav", "doc_id": "dvGkKzmIaN.seg_539", "src_text": "The results on four data sets show that our embedding marker can have great detection performance while keep great utility for downstream tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Die Ergebnisse auf vier Datensätzen zeigen, dass unser eingebetteter Marker eine großartige Erkennungsleistung haben kann, während er gleichzeitig eine großartige Funktionalität für nachgelagerte Aufgaben beibehält.", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_154.wav", "doc_id": "wLqFAuDnKa.seg_154", "src_text": "So, it seems that PaLM chooses to produce a better-sounding translation, sometimes by dropping parts of the source sentence that are made in translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Palm sich dafür entscheidet, eine bessere Übersetzung zu produzieren, indem er manchmal Teile der Quatschen in der Übersetzung weglässt.", "score": 82.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_787.wav", "doc_id": "WTTtiRKFZI.seg_787", "src_text": "The argument is based on the principle of dependency length minimization that I will explain on the basis of these examples.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "das Argument basiert auf dem Prinzip der linearen Abhängigkeitsminimierung, das anhand dieser Beispiele erklärt wird.", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_735.wav", "doc_id": "XejEJmgUmE.seg_735", "src_text": "And in this, minimal pair paradigm, the typical way to evaluate language models is that you show like an acceptable sentence or a grammatical sentence and then you show an acceptable sentence or an ungrammatical sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und in diesem minimalen Paradigma ist die übliche Art und Weise, Sprachmodelle zu bewerten, dass Sie eine akzeptable oder grammatikalische Sätze zeigen und dann unakzeptable oder ungrammatische Sätze. Und dann ist die", "score": 81.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_461.wav", "doc_id": "hgIDlKNiFM.seg_461", "src_text": "All the pre-trained model obtained from NACHOS are freely available on Hugging Face, and under the MIT license, and all the training scripts are on our GitHub repository.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "das vorgebildete Modell, das wir von Natcos erhalten haben, sind kostenlos verfügbar und alle Trainingsdaten sind in unserem GitHub-Repository. Frau Präsidentin, wir", "score": 53.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_330.wav", "doc_id": "dJGfOSFgZO.seg_330", "src_text": "You can see how the combination of all ABC-Eval metrics explains over 25% of conversation quality, and as you remove the metrics one at a time, most of them result in losing a decent amount of information about the quality.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "sehen, wie die Kombination aller ABC-Eval-Metriken über fünfzig Prozent der Gesprächsqualität erklärt, und wenn man die Messungen einmal entfernt, verliert man den größten Teil der Informationen über die Qualität. Auf", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_511.wav", "doc_id": "dvGkKzmIaN.seg_511", "src_text": "The watermark method need to meet the following properties.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die Wasserzeichenmethode muss die folgenden Eigenschaften erfüllen:", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_407.wav", "doc_id": "WBLMIsdIrq.seg_407", "src_text": "And this can be explained because English doesn't have dual pronouns, so you need context to determine if a pronoun is dual when translating into Arabic.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "und dies kann erklärt werden, weil Englisch keine Pronomen hat, die in Arabisch übersetzt werden", "score": 35.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_273.wav", "doc_id": "PIZEXUFLAR.seg_273", "src_text": "These tasks are derived from 21 existing open-source dataset and each task is equipped with five expert written instructions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Diese Aufgaben werden aus einundzwanzig bestehenden Datensätzen mit offenen Quellen abgeleitet, und jede Aufgabe ist mit fünf ausgeschriebenen Anweisungen ausgestattet.", "score": 82.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_611.wav", "doc_id": "oeooqChmKK.seg_611", "src_text": "We have defined three settings of KITMUS.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir haben drei Einstellungen von Kidmos definiert.", "score": 87.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_699.wav", "doc_id": "oaOHnMCwad.seg_699", "src_text": "In Live in the Wild is an online experimentation platform where we can recruit divers volunteers.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Lab in the Wild ist eine Online-Experimentierplattform, auf der wir verschiedene Freiwillige rekrutieren können,", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_442.wav", "doc_id": "hgIDlKNiFM.seg_442", "src_text": "So we ask ourselves a question about what is the most appropriate data sources for a wide range of usage and those crawled data are good substitution for clinical data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "was die am besten geeigneten Datenquellen für einen breiten Anwendungsbereich sind, und diese Daten sind eine gute Alternative für klinische Daten.", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_369.wav", "doc_id": "gGbuDbHhyc.seg_369", "src_text": "As we can see from the figures, the vanilla model, termed FTw, initially underperforms more complicated WSL methods, like COSINE.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wie wir aus den Zahlen sehen können, unterperformt das Valina-Modell, das FTVW genannt wird, zunächst gegenüber komplexeren WSL-Methoden wie CoSine.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_445.wav", "doc_id": "hgIDlKNiFM.seg_445", "src_text": "Is it 4 gigabytes, 8 gigabytes, or more?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Ist es 4 GB, 8 GB oder mehr?", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_538.wav", "doc_id": "dvGkKzmIaN.seg_538", "src_text": "We assume the provider apply wiki text data set to count word frequency.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir nehmen an, dass der Anbieter den Datensatz WikiText anwendet, um die Wortfrequenz zu ermitteln.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_667.wav", "doc_id": "FLkGnzVRew.seg_667", "src_text": "We find that PRC has the highest percentage of dissonance and works best for rare class.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "stellen fest, dass P.R.C. der höchste Prozentsatz an Unterschieden und Arbeiten für die Klasse ist,", "score": 59.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_501.wav", "doc_id": "dvGkKzmIaN.seg_501", "src_text": "It's my pleasure to give a short advertisement video of our paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Es ist mir eine Freude, ein kurzes", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_852.wav", "doc_id": "GvEBWkLmuI.seg_852", "src_text": "So in our method, we first designate what the unmarked and marked groups are, and then we compare the personas using the Fightin’ Words method, which is basically using weighted log-odds ratios to distinguish the top words for each marked group.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Mit unserer Methode werden wir zunächst beschreiben, was die unmarkierten und markierten Gruppen sind. Und wir vergleichen die Personen, die die Kampfwörtermethode verwenden, die im Grunde Gewichtsverhältnisse verwendet, um die Top-Wörter für jede Markgruppe zu unterscheiden.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_758.wav", "doc_id": "XejEJmgUmE.seg_758", "src_text": "So here we are choosing or creating sentences from acceptable and unacceptable domains from the same BLiMP or SyntaxGym dataset.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hier wählen oder erstellen wir Sätze aus akzeptablen und inakzeptablen Domänen aus dem gleichen Blip- oder Syntax-Datensatz.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_702.wav", "doc_id": "oaOHnMCwad.seg_702", "src_text": "Afterwards to stay engaged in the study, they can compare their responses to an AI and others.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Danach, um sich in der Studie zu engagieren, können sie ihre Antworten mit einer AI und anderen vergleichen.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_197.wav", "doc_id": "SLpqvupgvW.seg_197", "src_text": "For songs, we simply show a Google search link to each song and then ask the annotators to listen to at least some of each song, and read about each song.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "den Zwanzigern zeigen, für Songs zeigen wir einfach einen Google-Suchlink. Und dann bitten Sie die Annotatoren, zumindest einen Teil des Liedes zu hören und über das Lied zu lesen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_705.wav", "doc_id": "oaOHnMCwad.seg_705", "src_text": "We then compared these annotations with Dynahate, Perspective API, Rewire API, Hate Roberta and GPT 4.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "dann diese Anmerkungen mit DynaHate, Perspective API, Rewire API, Hate Roberta und GPT-4. Art und", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_482.wav", "doc_id": "SUkmfOTvGi.seg_482", "src_text": "And last but not least, we all know that the number of fine tuning examples directly affects the performance of a downstream task.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und nicht zuletzt wissen wir alle, dass die Anzahl der Feintuning-Beispiele die Leistung einer Downstream-Aufgabe direkt beeinflusst.", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_576.wav", "doc_id": "rISrKoXQCx.seg_576", "src_text": "There are a bunch of more examples in the appendix to further highlight that this indicates that there is a fairness issue that is very pressing regarding the political biases of language models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "anzuzeigen. Es gibt noch viele weitere Beispiele in den Anhängen. Dies deutet darauf hin, dass es sich um ein Fairnessproblem handelt, das in Bezug auf die politischen Sprachmodelle sehr dringend ist.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_590.wav", "doc_id": "oeooqChmKK.seg_590", "src_text": "This work is a collaboration between McGill University, Mila, and Microsoft Research.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Diese Arbeit ist eine Zusammenarbeit zwischen der Universität Mila und Microsoft Research.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_230.wav", "doc_id": "oYCKgTzTDy.seg_230", "src_text": "We use Google Translate API to translate source to the target language, then use monolingual model to train and evaluation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir verwenden Google Translate API, um die Quelle in die Ziel-Sprache zu übersetzen, und dann verwenden wir ein monolinguales Modell, um zu trainieren und zu evaluieren.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_253.wav", "doc_id": "oYCKgTzTDy.seg_253", "src_text": "We found that, by comparing the green and orange line, we found the Zero-shot setting, the Cross-lingual transfer performance gap is significant, and then comparing the blue and orange lines, we found that with the Few-shot setting the transfer gap is shortened rapidly.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "die Einzelsprachübertragung. Wir stellten fest, dass sich die Übertragungsleistung für die Einstellung „grün-orange“ erheblich verringert, während sich die Übertragungsleistung für die Einstellung „blau-orange“ mit wenigen Schüssen erheblich verringert.", "score": 88.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_864.wav", "doc_id": "GvEBWkLmuI.seg_864", "src_text": "This contributes to a long legacy of discrimination and othering for these groups.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dies trägt zu einer langen Geschichte von Diskriminierung und anderen Dingen für diese Gruppen bei.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_695.wav", "doc_id": "oaOHnMCwad.seg_695", "src_text": "And we ought to do this over looking at the demographics of original data sets annotators, because, usually only a few annotators annotate each instance and because demographics are rarely collected and shared.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wiederauswerten von Daten mit diversen Annotatoren, und wir möchten dies über die Demografien der ursprünglichen Datenmengen Annotatoren tun, weil normalerweise nur wenige Annotatoren jede Instanz annotieren und weil Demografien seltener gesammelt und geteilt", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_313.wav", "doc_id": "dJGfOSFgZO.seg_313", "src_text": "The common practice is to use human evaluation, such as by asking human judges to select which of two conversations is better or to rate conversations given a Likert scale.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die allgemeine Praxis ist, menschliche Bewertungen zu verwenden, beispielsweise, indem man menschliche Richter bittet, zu entscheiden, welche zwei Gespräche besser sind, oder indem man Gespräche, die eine niedrige Bewertung erhalten, mit einer niedrigeren Bewertung bewertet.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_86.wav", "doc_id": "TVCREhgqUP.seg_86", "src_text": "In addition, sometimes there are multiple permutations that are consistent with the data, but the linguistically correct one is latent.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Darüber hinaus gibt es manchmal mehrere Permutationen, die mit den Daten übereinstimmen, aber die sprachlich korrekte ist latent.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_493.wav", "doc_id": "SUkmfOTvGi.seg_493", "src_text": "And these goes hand in hand, we can't just have one ingredient but throw out the others.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "diese Ziele gehen Hand in Hand, wir können nicht nur ein Zutaten haben, sondern alle anderen durchgehen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_164.wav", "doc_id": "SLpqvupgvW.seg_164", "src_text": "\"Did you mean 'Easy on Me' or 'I Gotta Feeling'?\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Frage: Haben Sie „einfach“ oder „Gefühl“ gemeint? Der", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_741.wav", "doc_id": "XejEJmgUmE.seg_741", "src_text": "So that is the approach.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Das ist der Ansatz,", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_150.wav", "doc_id": "wLqFAuDnKa.seg_150", "src_text": "But, PaLM comes pretty close to a commercial system.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Pan-Übersetzungen. Dann? 2 kommt unserem kommerziellen System ziemlich nahe:", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_597.wav", "doc_id": "oeooqChmKK.seg_597", "src_text": "In this work, we propose a diagnostic test suite for knowledge integration.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "In dieser Arbeit schlagen wir ein diagnostisches Testsystem für die Wissensintegration vor.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_690.wav", "doc_id": "oaOHnMCwad.seg_690", "src_text": "However these works really don't look at comparing end users with the datasets and models themselves, and studying model and data set positionality is increasingly important as NLP tasks become more subjective and socially oriented, and it's challenging to characterise how these positionalities are skewed because not all decisions are documented and many models are hidden behind APIs.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Diese Arbeiten vergleichen jedoch nicht wirklich Endnutzer mit den Datensätzen und Modellen selbst. Das Studium der Modell- und Datenpositionalität wird immer wichtiger, da die NP-Tests subjektiver und sozial orientierter werden. Es ist schwierig, zu charakterisieren, wie diese Positionalitäten verzerrt sind, weil nicht alle Entscheidungen dokumentiert sind und viele Modelle hinter APIs versteckt sind.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_177.wav", "doc_id": "SLpqvupgvW.seg_177", "src_text": "In the first bubble, Bob says, \"Remember that song we were listening to yesterday?\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "In dem ersten Bläser sagt Bob: „Erinnere dich an das Lied, das wir gestern gehört haben“,", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_366.wav", "doc_id": "gGbuDbHhyc.seg_366", "src_text": "The right figure shows the performance difference between fine-tuning approaches, which are directly applied on the clean data, and WSL approaches, which use the clean data for validation only.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die rote Abbildung zeigt den Leistungsunterschied zwischen Fine-Tuning-Ansätzen, die direkt auf sauberen Daten angewendet werden, und WSL-Ansätzen, die nur für die Validierung von sauberen Daten verwendet werden.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_769.wav", "doc_id": "XejEJmgUmE.seg_769", "src_text": "Please read our paper for more details of our experiments.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Bitte lesen Sie unser Papier für weitere Details zu unseren", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_149.wav", "doc_id": "wLqFAuDnKa.seg_149", "src_text": "Nevertheless, specialized state-of-the-art systems have a substantial advantage over the PaLM translations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dennoch haben spezialisierte Systeme einen erheblichen Vorteil gegenüber den", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_680.wav", "doc_id": "oaOHnMCwad.seg_680", "src_text": "But that's not really the case for Aditya Sharma.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "aber das ist nicht wirklich der Fall für Aditya Sharma, wo", "score": 87.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_389.wav", "doc_id": "WBLMIsdIrq.seg_389", "src_text": "But if the previous sentence was \"Could it be anything serious, doctor?\", then \"mole\" refers to a birthmark.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Aber wenn die vorherige Aussage lautete, „Könnte es irgendetwas Ernstes, Doktor?“, bezieht sich „Moe“ auf einen Geburtsurkunden.", "score": 78.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_865.wav", "doc_id": "GvEBWkLmuI.seg_865", "src_text": "Furthermore, there's a lot of common tropes that are reflected in these words, especially for women of color.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Darüber hinaus sind in diesen Wörtern viele Komposita enthalten, insbesondere für Frauen", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_310.wav", "doc_id": "dJGfOSFgZO.seg_310", "src_text": "And today we'll tell you all about ABC-Eval, a new dimensional approach to evaluating conversational AI.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und heute erzählen wir Ihnen alles über Abcevel, einen neuen dimensionalen Ansatz zur Bewertung von Konversations-AI.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_361.wav", "doc_id": "gGbuDbHhyc.seg_361", "src_text": "As shown in this figure, if there are no clean validation samples, then the trained models cannot generalize beyond the original weak labels, meaning that the training is pointless.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wie in dieser Abbildung zu sehen ist, wenn es keine sauberen Validierungsmuster gibt, dann können die Trendmodelle nicht über die ursprünglichen Bit-Labels generalisiert werden. Das bedeutet, dass die Doktrin sinnlos ist.", "score": 53.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_16.wav", "doc_id": "aQpIWggfCo.seg_16", "src_text": "Then we conduct detailed analysis to investigate why learning models fail.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dann führen wir detaillierte Analysen durch, um zu untersuchen, wofür Landnutzungsmodelle geeignet sind. Die Ergebnisse", "score": 59.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_435.wav", "doc_id": "hgIDlKNiFM.seg_435", "src_text": "We also introduced a comparison of models with multiple pre-training settings and data sources.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir stellen auch einen Vergleich von Modellen mit multiplen prädiktiven Einstellungen und Datenquellen an,", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_597.wav", "doc_id": "oeooqChmKK.seg_597", "src_text": "In this work, we propose a diagnostic test suite for knowledge integration.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "In diesem Projekt schlagen wir einen diagnostischen Test vor, um die Wissensintegration zu ermöglichen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_243.wav", "doc_id": "oYCKgTzTDy.seg_243", "src_text": "And, we also evaluate Encoder-Decoder models, which is Multilingual Pretrained Encoder-Decoder Models, such as mBART and mT5.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "steht. Und wir bewerten auch Encoder-Decoder-Modelle, die multilinguale trainierte Encoder-Decoder-Modelle sind, wie z. B. Anbert und M.T.5.", "score": 53.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_164.wav", "doc_id": "SLpqvupgvW.seg_164", "src_text": "\"Did you mean 'Easy on Me' or 'I Gotta Feeling'?\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Frage: Wollen Sie es mir leicht machen oder haben Sie ein Gefühl dafür?", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_210.wav", "doc_id": "SLpqvupgvW.seg_210", "src_text": "For example, when the language model retrieves the background knowledge.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wenn das Sprachmodell das Hintergrundwissen zurückgibt.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_239.wav", "doc_id": "oYCKgTzTDy.seg_239", "src_text": "We train on one source language and transfer to another language.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "einer Quellsprache und einer Ziel-Sprache.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_374.wav", "doc_id": "gGbuDbHhyc.seg_374", "src_text": "Our concrete recommendations for future work are as follows.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Unsere konkreten Empfehlungen für zukünftige Arbeiten sind wie folgt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_8.wav", "doc_id": "aQpIWggfCo.seg_8", "src_text": "An abstract goal can be inherited by different real-life specific goals with multi-faceted constraints.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Ein abstraktes Ziel kann durch unterschiedliche realleben-spezifische Ziele mit multifaktoriellen Einschränkungen vererbt werden.", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_862.wav", "doc_id": "GvEBWkLmuI.seg_862", "src_text": "First, from our groups, the top words include things like \"culture\", \"tradition\", \"proud\", and \"exotic\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Zunächst für Markgruppen: Die oberen Wörter beinhalten Dinge wie Kultur, Tradition, stolz und exotisch.", "score": 86.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_198.wav", "doc_id": "SLpqvupgvW.seg_198", "src_text": "Here's for example, the Google search result for the song \"Easy on Me.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hier ist beispielsweise das Google-Suchergebnis für das Lied „Easy“. Für die", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_103.wav", "doc_id": "uZBWfYjYnf.seg_103", "src_text": "And leverage the knowledge already acquired by the model through the attention mechanism between audio input and textual output.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und nutzen Sie die bereits durch das Modell erworbenen Kenntnisse durch die Aufmerksamkeit zwischen Audio-Eingabe und Text. Output, also", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_168.wav", "doc_id": "SLpqvupgvW.seg_168", "src_text": "This could happen when the user cannot remember the name of the song.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "dies könnte passieren, wenn der Benutzer sich den Namen des Geräts nicht erinnern kann.", "score": 82.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_761.wav", "doc_id": "XejEJmgUmE.seg_761", "src_text": "Now this and this is very large like this effect, increases throughout the context length and this would probably affect like newer language models which has large context window.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Jetzt ist dies sehr groß – wie dieser Effekt sich über die Kontextlänge erstreckt und dies wahrscheinlich neue Sprachmodelle mit großen Kontextfenstern beeinflusst.", "score": 77.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_534.wav", "doc_id": "dvGkKzmIaN.seg_534", "src_text": "The cosine and L2 similarity between the requested embedding and the target embedding are computed.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die Koinosität und L2-Similität zwischen dem angeforderten Einbetten und dem Ziel-Einbetten werden berechnet;", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_516.wav", "doc_id": "dvGkKzmIaN.seg_516", "src_text": "Existing works can be broadly classified into four categories.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Bestehende Werke können im Großen und Ganzen in vier Kategorien eingeteilt werden.", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_248.wav", "doc_id": "oYCKgTzTDy.seg_248", "src_text": "I think this is known as the \"Curse of Multilinguality\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "denke, dies wird als Fluch der Mehrsprachigkeit bezeichnet.", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_96.wav", "doc_id": "uZBWfYjYnf.seg_96", "src_text": "Specific architectures are usually trained, introducing additional modules to be optimized.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Spezifische Architekturen werden üblicherweise trainiert, um zusätzliche Module einzuführen, die optimiert werden können. Langfristige,", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_541.wav", "doc_id": "dvGkKzmIaN.seg_541", "src_text": "The legend of the figures means the number of triggers in each sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "wobei die Legende der Figuren die Anzahl der Auslöser in jedem Satz bedeutet.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_596.wav", "doc_id": "oeooqChmKK.seg_596", "src_text": "Therefore, successful models for knowledge-intensive NLU tasks require the ability to integrate and use both pretrain-time and inference-time knowledge.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Deshalb erfordern erfolgreiche Modelle für N-LU-Aufgaben die Fähigkeit, Vor-Trainingszeit und Inferenzzeit-Wissen zu integrieren und zu verwenden.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_526.wav", "doc_id": "dvGkKzmIaN.seg_526", "src_text": "When a user send a sentence to the provider service the provider counts the trigger number in the sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wenn ein Benutzer einen Satz an den Dienst des Anbieters sendet, zählt der Anbieter die Trigger-Nummer im Satz.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_139.wav", "doc_id": "wLqFAuDnKa.seg_139", "src_text": "So in this example here, where we perform translation from German into English, the German sentences, the source sentences, are marked with German colon and the English translations with English colon.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In diesem Beispiel hier, wo wir Übersetzungen von Deutsch ins Englische durchführen, markieren wir die deutschen Sätze. Diese Sätze sind mit einem deutschen Kolon markiert und die englischen Übersetzungen mit englischen Spalten.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_278.wav", "doc_id": "PIZEXUFLAR.seg_278", "src_text": "In which the input text, images, instructions and bounding boxes are represented in the same token space.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "in der der Eingabetext, Bilder, Anweisungen und Bindungskästchen in derselben Token-Raum repräsentiert werden.", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_545.wav", "doc_id": "dvGkKzmIaN.seg_545", "src_text": "Welcome to discuss with us.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir werden uns mit Ihnen unterhalten.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_445.wav", "doc_id": "hgIDlKNiFM.seg_445", "src_text": "Is it 4 gigabytes, 8 gigabytes, or more?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Es ist vier Gigabyte, acht Gigabyte oder mehr.", "score": 86.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_713.wav", "doc_id": "oaOHnMCwad.seg_713", "src_text": "So for GPT 4, in the social acceptability task, we find that it's most aligned to people with a college education or Graduate School education and we find the same for Dynahate where it's most aligned to people with a college education.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wir die meisten Zuordnungen zu Personen mit Hochschulbildung oder Hochschulabschluss. Und wir finden das Gleiche für Donieght, wo es sich hauptsächlich um Menschen mit einer Hochschulbildung handelt.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_191.wav", "doc_id": "SLpqvupgvW.seg_191", "src_text": "The second one is when the entities have similar titles, for example, two books with the name \"The Return\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zufallsauswahl, die zweite Methode ist die gleichnamige Zufallsauswahl, z.B. zwei Bücher mit dem Namen 'The Return',", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_122.wav", "doc_id": "wLqFAuDnKa.seg_122", "src_text": "Hello everyone, my name is David Vilar, and I will be giving a short review of the paper \"Prompting PaLM for Translation: Assessing Strategies and Performance.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "1 Hallo, ich bin Irwan, mein Name ist Ayesha Vilar und ich werde eine kurze Zusammenfassung des Papiers 'Promoting Powerful Translation: Assessing Strategies and Performance' geben.", "score": 63.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_532.wav", "doc_id": "dvGkKzmIaN.seg_532", "src_text": "Back door data set contains sentences of which all words belong to the trigger set while all words in the sentences of benign data set do not belong to the trigger sets.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Der Backdoor-Datensatz enthält Sätze, in denen alle Wörter zum Trigger-Set gehören, während alle Wörter im günstigen Datensatz nicht zum Trigger-Set gehören.", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_818.wav", "doc_id": "WTTtiRKFZI.seg_818", "src_text": "In such cases, the left conjunct prefers to be shorter; the most of the biggest difference between the two conjuncts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In solchen Fällen ist die linke Konjunktion bevorzugt, die größere Differenz zwischen den beiden Wörtern. Allerdings", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_871.wav", "doc_id": "GvEBWkLmuI.seg_871", "src_text": "So rather than actually working towards changing those obstacles, it puts pressure on those people to overcome them, which leads to a very negative health outcomes for these people, among other harms.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "anstatt tatsächlich daran zu arbeiten, diese Hindernisse zu ändern, und Druck auf diese Menschen auszuüben. - die zu sehr negativen Gesundheitsauswirkungen für diese Menschen und andere Schäden führt. Im Allgemeinen finden wir,", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_658.wav", "doc_id": "FLkGnzVRew.seg_658", "src_text": "Next, we determine the best method to update a model with new data from each round of active learning and annotations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Als nächstes werden wir die beste Methode ermitteln, um ein Modell mit neuen Daten aus jeder Runde des aktiven Lernens und der Anmerkungen zu aktualisieren.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_408.wav", "doc_id": "WBLMIsdIrq.seg_408", "src_text": "And similarly, we find that certain languages also require context when we want to choose the appropriate verb form.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "wir eine geeignete Verbform wählen möchten.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_506.wav", "doc_id": "dvGkKzmIaN.seg_506", "src_text": "Embedding as services is one of the services built upon large language models to assist various NLP tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Embedding ADS ist eine der Dienste, die auf großen Sprachmodellen gebaut wurden, um verschiedene NLP-Aufgaben zu unterstützen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_682.wav", "doc_id": "oaOHnMCwad.seg_682", "src_text": "This is an example of a design bias where we see systematic performance differences of technology between populations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Dies ist ein Beispiel für ein Designfehler, bei dem wir systematische Leistungsunterschiede zwischen Technologien zwischen Bevölkerungsgruppen sehen.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_514.wav", "doc_id": "dvGkKzmIaN.seg_514", "src_text": "Third, the watermark should be covert enough to the attacker or the attacker can remove the watermark easily.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Drittens sollte die Wassermarke ausreichend für den Angreifer abgedeckt sein, oder der Angreifer kann die Wassermarke leicht entfernen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_39.wav", "doc_id": "aQpIWggfCo.seg_39", "src_text": "With CoScript we can try smaller but specialized models for constrained language planning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "für eingeschränkte Sprachplanung. Auf", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_375.wav", "doc_id": "gGbuDbHhyc.seg_375", "src_text": "First, report the model selection criteria.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Zunächst müssen die Kriterien für die Modellauswahl angegeben werden;", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_771.wav", "doc_id": "WTTtiRKFZI.seg_771", "src_text": "Hi, my name is Adam Przepiórkowski and this talk is about the Dependency Structure of Coordination.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hallo, mein Name ist Adam Schyrkovski, und dieses Gespräch dreht sich um die Abhängigkeitsstruktur der Koordination.", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_723.wav", "doc_id": "oaOHnMCwad.seg_723", "src_text": "I mean, we want to emphasise that inclusive NLP isn't just making.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Ich meine, wir möchten betonen, dass eine inklusive NLP nicht nur bedeutet, dass alle", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_749.wav", "doc_id": "XejEJmgUmE.seg_749", "src_text": "So here the sentences are still coming from a relevant data sets but it's not from the same data set that you are evaluating with.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hier werden also die Sätze immer noch aus relevanten Datensätzen, aber nicht aus dem gleichen Datensatz, mit dem Sie die Bewertung durchführen, und", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_210.wav", "doc_id": "SLpqvupgvW.seg_210", "src_text": "For example, when the language model retrieves the background knowledge.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "wenn das Sprachmodell das Hintergrundwissen wiedererlangt.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_773.wav", "doc_id": "WTTtiRKFZI.seg_773", "src_text": "So for example, in the universal dependencies, the structure of the coordination, Lisa, Bart, and Maggie, such that the first conjunct is the head of the whole coordinate structure.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "so zum Beispiel bei universellen Abhängigkeiten, die Struktur der koordinierten Koordination Lisa A. B. und Maggie. Es ist so, dass der erste Konjunkt ist der Kopf der gesamten Kordstruktur, also", "score": 78.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_418.wav", "doc_id": "WBLMIsdIrq.seg_418", "src_text": "We then use the MuDA tagger, by applying the tagger on a parallel corpus that we want to use for evaluation and we apply our translation metrics of choice on the context-dependent examples that the MuDA tagger has identified.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Dann verwenden wir den Mudatagger, indem wir den Tag auf einen parallelen Korpus anwenden, den wir für die Bewertung verwenden möchten, und unsere Übersetzungsmetriken der Wahl auf die kontextabhängigen Beispiele, die der Mudatagger identifiziert hat.", "score": 76.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_663.wav", "doc_id": "FLkGnzVRew.seg_663", "src_text": "We find that the proposed PRC strategy works better than other state-of-the-art strategies, although the difference is small.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir stellen fest, dass die vorgeschlagene PR-Strategie besser funktioniert als andere Strategien, auch wenn der Unterschied gering ist,", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_768.wav", "doc_id": "XejEJmgUmE.seg_768", "src_text": "And the MPP evaluation the way that we do it currently with short and single sentence input, may not fully capture the language models abstract knowledge throughout the context window.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Und die MPV-Bewertung, die Art und Weise, wie wir es derzeit korrekt mit kurzer und einziger Satz-Eingabe tun, mag nicht die abstrakte Wissen der Sprachmodelle durch den Kontextfenster", "score": 87.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_166.wav", "doc_id": "SLpqvupgvW.seg_166", "src_text": "The most obvious thing is to use a direct reference, for example by saying the name of the song \"Easy on Me\" or its position, \"the first one\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Die offensichtlichste Sache ist, eine direkte Referenz zu verwenden, z.B. indem man den Namen des Liedes 'Easy on me' oder seine Position, 'Erster', sagt.", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_112.wav", "doc_id": "uZBWfYjYnf.seg_112", "src_text": "So we want our curves to be as high as possible on this plot.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir wollen also, dass unsere Warteschlangen auf diesem Plot so hoch wie möglich sind.", "score": 77.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_97.wav", "doc_id": "uZBWfYjYnf.seg_97", "src_text": "Long and complicated training procedures, for example, training involving different optimization objectives.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "komplizierte Trainingsverfahren, zum Beispiel das Training, das", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_452.wav", "doc_id": "hgIDlKNiFM.seg_452", "src_text": "These models are compared to six baseline models which are CamemBERT OSCAR 138 GB, CamemBERT OSCAR 4 GB, CamemBERT CCNET 4 GB, PubMedBERT, BioBERT, and ClinicalBERT.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "verglichen, die sich auf die folgenden Modelle beziehen: Kamnaber-Osaka 1,38 GB, Kamnaber-Osaka 4 GB, Kamnaber-Cisnet 4 GB, Plumbert-BioBERT und ClinicalBERT.", "score": 35.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_511.wav", "doc_id": "dvGkKzmIaN.seg_511", "src_text": "The watermark method need to meet the following properties.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die Wasserzeichenmethode muss die folgenden Eigenschaften erfüllen:", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_535.wav", "doc_id": "dvGkKzmIaN.seg_535", "src_text": "We compute the similarity difference between benign and backdoor data set which is defined as delta cosine and delta L2.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "wir berechnen den Unterschied der Ähnlichkeit zwischen dem Benignen und dem Hintergrund-Datensatz, der als Delta-Koinosität und Delta-L2 definiert wird.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_735.wav", "doc_id": "XejEJmgUmE.seg_735", "src_text": "And in this minimal pair paradigm, the typical way to evaluate language models is that you show like an acceptable sentence or a grammatical sentence and then you show an acceptable sentence or an ungrammatical sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Und in diesem Minimalparadigma wird die typische Art, Sprachmodelle zu bewerten, so, dass man eine akzeptable oder grammatikalische Sätze zeigt und dann einen unakzeptablen oder ungrammatischen Satz zeigt. Und dann ist", "score": 82.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_325.wav", "doc_id": "dJGfOSFgZO.seg_325", "src_text": "For each of the existing methods, we collected evaluations on eight of the most commonly measured aspects of dialogue, since this is the standard practice for evaluating chat models along multiple dimensions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Für jede der vorhandenen Methoden haben wir Bewertungen zu acht der am häufigsten gemessenen Aspekte des Dialogs gesammelt, da dies die Standardpraxis für die Bewertung von Chat-Modellen in mehreren Dimensionen ist.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_461.wav", "doc_id": "hgIDlKNiFM.seg_461", "src_text": "All the pre-trained model obtained from NACHOS are freely available on Hugging Face, and under the MIT license, and all the training scripts are on our GitHub repository.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "besser, aber es passt nicht gut. Das Pre-Training-Modell, das von Natasha stammt, ist frei verfügbar und auf Hugging Face sowie alle Trainingsskripte auf unserem GitHub-Repository. Also vielen", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_226.wav", "doc_id": "oYCKgTzTDy.seg_226", "src_text": "We provide a uniform data set XSemPLR for cross-lingual semantic parsing in multiple natural languages and meaning representations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir bieten ein uniformes Datensatz-Beispiel für die semantische Analyse von Mehrfachverbindungen in mehreren natürlichen Sprachen und Meningsrepräsentationen. Es enthält", "score": 56.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_367.wav", "doc_id": "gGbuDbHhyc.seg_367", "src_text": "As we can see, if we have 10 samples per class, direct fine-tuning starts to beat WSL approaches.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wie wir sehen können, beginnt die direkte Feinabstimmung, wenn wir zehn Proben pro Klasse haben.", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_388.wav", "doc_id": "WBLMIsdIrq.seg_388", "src_text": "Well, if the previous sentence was \"Things could start to get dangerous if the ministers find out\", then \"mole\" refers to a spy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Nun, wenn der vorherige Satz „Dinge könnten gefährlich werden, wenn die Minister das herausfinden“ lautet, dann bezieht sich Mo auf einen Spion.", "score": 86.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_419.wav", "doc_id": "WBLMIsdIrq.seg_419", "src_text": "And finally, we use our benchmark as well as other metrics to evaluate different models on the document-level machine translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und schließlich verwenden wir unsere Benchmarks sowie andere Metriken, um unterschiedliche Modelle zu bewerten, auf Dokumentenebene. Zunächst,", "score": 82.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_316.wav", "doc_id": "dJGfOSFgZO.seg_316", "src_text": "One approach is to simply ask human judges to evaluate several dimensions of dialogue quality, such as the relevance of model responses using existing comparative or Likert scale methods.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Ein Ansatz besteht darin, einfach menschliche Urteilsfinder zu bitten, mehrere Dimensionen der Dialogqualität zu bewerten, wie z. B. die Relevanz von Modellantworten unter Verwendung von vergleichenden oder Likert-Skalenmethoden.", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_635.wav", "doc_id": "FLkGnzVRew.seg_635", "src_text": "Simply put, cognitive dissonance is two beliefs or actions that are inconsistent, such as this example where a person states, \"I know that cigarettes could kill me\", and then goes on to say \"I grabbed a couple of smokes after the meeting\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Einfach ausgedrückt ist kognitive Dissonanz zwei Glaubenssätze oder Handlungen, die inkonsistent sind. In diesem Beispiel, wenn eine Person sagt, ich weiß, dass die Zigaretten mich umbringen würden, und dann sage ich, ich schnorchele ein paar Zigaretten nach", "score": 62.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_488.wav", "doc_id": "SUkmfOTvGi.seg_488", "src_text": "This means that every unit of improvement that we made on CoNLL-2003 translates to more than one unit improvement on CoNLL++ which means that there is no diminishing returns.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "hat. Dies bedeutet, dass jede Einheit der Verbesserung, die wir auf Carolo 2003 vorgenommen haben, sich zu mehr als einer Einheit Verbesserung auf Carolo + Plus übersetzt, was bedeutet, dass es keine Abnahme der Rendite gibt.", "score": 62.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_7.wav", "doc_id": "aQpIWggfCo.seg_7", "src_text": "In this paper, we define the problem of constrained language planning which imposes different constraints on the goals of planning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "In diesem Papier definieren wir das Problem der eingeschränkten Sprachplanung. Dies setzt unterschiedliche Einschränkungen für die Planungsziele voraus.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_153.wav", "doc_id": "wLqFAuDnKa.seg_153", "src_text": "So, in particular, the most common errors are omission errors.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "die häufigsten Fehler sind Fehler der Nichtbeachtung.", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_379.wav", "doc_id": "gGbuDbHhyc.seg_379", "src_text": "Finally, we have open-sourced our code.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Endlich haben wir unseren Code in der Öffentlichkeit zugänglich gemacht.", "score": 88.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_586.wav", "doc_id": "rISrKoXQCx.seg_586", "src_text": "Ok, great.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Okay, großartig,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_167.wav", "doc_id": "SLpqvupgvW.seg_167", "src_text": "But sometimes an indirect reference is more appropriate to have a more natural conversation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Aber manchmal ist eine indirekte Anspielung angemessener, um eine natürlichere Konversation:", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_161.wav", "doc_id": "SLpqvupgvW.seg_161", "src_text": "My name is Javad Hosseini and this is a joint work with Filip Radlinski, Silvia Pareti, and Annie Louis.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Mein Name ist Jawad Hussaini, und das ist eine gemeinsame Arbeit mit Philip Radlinski, Sylvia Patry und Ani Tuis.", "score": 88.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_580.wav", "doc_id": "rISrKoXQCx.seg_580", "src_text": "We would also like to highlight that we expose the unique dilemma regarding language model political biases.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "der weiteren Diskussion möchten wir auch darauf hinweisen, dass wir die einzigartige Dilemma in Bezug auf die Sprachmodalitäten erläutern", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_716.wav", "doc_id": "oaOHnMCwad.seg_716", "src_text": "We find this in the GPT 4 social acceptability task as well as the Dynahate task analysis as well.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir finden dies in der Gleichstellungsaufgabe. Was können wir unter Berücksichtigung der Tatsache", "score": 69.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_146.wav", "doc_id": "wLqFAuDnKa.seg_146", "src_text": "In particular, we compare the selecting prompts from the training data for the WMT evaluations on the dev data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "insbesondere vergleichen wir die. 1. Die Auswahl von Anregungen aus den Trainingsdaten der WMT-Evaluierungen oder 2. Die dev-Daten.", "score": 76.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_455.wav", "doc_id": "hgIDlKNiFM.seg_455", "src_text": "We also observe that using more data translated to better performance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und wir stellen auch fest, dass die Verwendung von mehr Daten zu besseren Leistungen führt.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_704.wav", "doc_id": "oaOHnMCwad.seg_704", "src_text": "We then replicate a very similar setup for the toxicity and hate speech detection task, where they'll read an instance from Dynahate and write whether they think it's instance of hate speech.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "wiederholten wir sehr ähnliche Sätze für die Toxizitäts- und Sprachdetektionsaufgabe, wobei die Fälle von „dienen“ und „rechtens“ als Beispiele für die Sprachdetektionsaufgabe dienten. Dann vergleichen wir diese Anmerkungen", "score": 19.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_56.wav", "doc_id": "TVCREhgqUP.seg_56", "src_text": "In contrast to standard machine learning evaluation, the test set does not come from the same distribution but contains structurally unseen logical forms.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Im Gegensatz zu standardmäßiger maschinellem Lernalgebra wird die Testmenge nicht aus der gleichen Verteilung stammen, sondern enthält strukturell unerkannte logische Formen.", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_451.wav", "doc_id": "hgIDlKNiFM.seg_451", "src_text": "To evaluate our seven models, we gather data for public and private downstream tasks such as named entity recognition, classification, part-of-speech tagging, and question answering.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Um unsere sieben Modelle zu bewerten, haben wir mehrere öffentliche und private Downstream-Tasks wie NER, Klassifikation, Part-of-Speech-Tagging, Das Modell wird mit sechs Basismodellen", "score": 69.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_822.wav", "doc_id": "WTTtiRKFZI.seg_822", "src_text": "What we see here is that when the governor is on the left, the tendency for the left conjunct to be shorter grows steadily, with the absolute difference in words, and the same is observed when there is no governor as in coordination of sentences.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "dass das so ist, wenn der Gouverneur auf der Rückseite ist. Die Tendenz des linken Konjunktivs, kürzer zu werden, wächst stetig, mit dem absoluten Unterschied in den Wörtern, und dasselbe wird beobachtet,", "score": 72.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_833.wav", "doc_id": "GvEBWkLmuI.seg_833", "src_text": "Furthermore, most work in this space doesn't account for intersectionality, which is the notion that multi-faceted social identities can compound biases and be unique loci of harm.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Darüber hinaus rechnet die meisten Arbeit in diesem Bereich nicht mit der Intersektionalität, die die Idee beinhaltet, dass vielfältige soziale Identitäten Vorurteile kombinieren und einzigartige Opfer von Schaden sein können.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_465.wav", "doc_id": "SUkmfOTvGi.seg_465", "src_text": "Let's get started.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "2003? Beginnen wir.", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_111.wav", "doc_id": "uZBWfYjYnf.seg_111", "src_text": "If we look at the main results of EDAtt, we'll plot the simultaneous speech translation results on graphs in which we have BLEU on one side that measures the translation quality, and average lagging that is the latency measure, and we also consider the computational aware average lagging that accounts for the model's computational times to predict the output.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wenn wir uns die Hauptergebnisse ansehen, plotten wir die simultane Übersetzungsleistung in einem Diagramm, in dem wir auf einer Seite die Übersetzungsqualität blau und auf der anderen Seite die durchschnittliche Verzögerung d. h. die Latenzmessung messen. Wir berücksichtigen auch die computergesteuerte durchschnittliche Verzögerung. Daher möchten wir,", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_87.wav", "doc_id": "TVCREhgqUP.seg_87", "src_text": "We address this by inducing the alignment as part of the training.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir beheben dies, indem wir die Ausrichtung als Teil der Ausbildung induzieren.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_558.wav", "doc_id": "rISrKoXQCx.seg_558", "src_text": "So some preliminary results demonstrate that first, language models do have varying political leanings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Einige vorläufige Ergebnisse zeigen, dass erste Sprachmodelle unterschiedliche politische Bedeutungen haben.", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_324.wav", "doc_id": "dJGfOSFgZO.seg_324", "src_text": "For comparison, we also evaluated these conversations using three existing methods: Likert ratings on the turn-level, Likert ratings on the dialogue-level, and dialogue-level pairwise comparisons.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zur Vergleich haben wir diese Gespräche auch mit drei bestehenden Methoden bewertet: Lickert-Bewertungen auf der Drehungsebene, Lickert-Bewertungen auf der Dialogebene und Dialogebene: Paarweisen Vergleiche.", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_127.wav", "doc_id": "wLqFAuDnKa.seg_127", "src_text": "In this work, we present the first systematic study of large language model prompting for machine translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In dieser Arbeit stellen wir die erste systematische Untersuchung des Großsprachmodells für die maschinelle Übersetzung vor.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_80.wav", "doc_id": "TVCREhgqUP.seg_80", "src_text": "To give you a teaser of the experimental results, here we compare our method with other treeless models on the COGS benchmark.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Um Ihnen einen Eindruck von den experimentellen Ergebnissen zu vermitteln, vergleichen wir unsere Methode mit anderen Treelers-Modellen auf dem Korg-Benchmark.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_270.wav", "doc_id": "PIZEXUFLAR.seg_270", "src_text": "However, there is no large-scale publicly-available multi-modal instruction task.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "es ist kein groß angelegtes öffentlich verfügbares Multimodal", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_733.wav", "doc_id": "XejEJmgUmE.seg_733", "src_text": "So the minimal pair paradigm basically evaluates language models on top of acceptability judgments.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "So bewertet das minimale Paar-Paradigma grundlegend Sprachmodelle über Akzeptanzurteile, die", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_464.wav", "doc_id": "SUkmfOTvGi.seg_464", "src_text": "Today I'm going to present our paper Do CoNLL-2003 named entity taggers still work well in 2023?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "heute werde ich unseren Bericht präsentieren: Funktionieren die von Cornel 2003 benannten Entity-Tags noch gut im Jahr", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_315.wav", "doc_id": "dJGfOSFgZO.seg_315", "src_text": "Therefore, you might want to evaluate multiple dimensions of chat quality to understand the strengths and weaknesses of the model on a finer-grained level.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "daher könnten Sie mehrere Dimensionen der Dialogqualität bewerten, um die Stärken und Schwächen des Modells auf einem höheren Niveau zu verstehen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_390.wav", "doc_id": "WBLMIsdIrq.seg_390", "src_text": "So, depending on context, the meaning of the word changes, and therefore its translation changes as well.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Daher ändert sich die Bedeutung des Wortes je nach Kontext, und daher ändert sich auch seine Übersetzung.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_746.wav", "doc_id": "XejEJmgUmE.seg_746", "src_text": "So we can do the same thing by choosing unacceptable sentences from the same matching, and that could also be used to test the models acceptability.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Also können wir dasselbe tun, indem wir inakzeptable Sätze aus der gleichen Übereinstimmung auswählen, und das könnte auch verwendet werden, um die Akzeptanz der Modelle zu testen.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_257.wav", "doc_id": "oYCKgTzTDy.seg_257", "src_text": "To sum up, we build XSemPLR, a unified benchmark for cross-lingual semantic parsing with multiple natural languages and meaning representations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zusammenfassend bauen wir ein Beispiel, einen einheitlichen Benchmark für die Kreuzungssyntax mit mehreren natürlichen Sprachen und vielen Repräsentationen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_279.wav", "doc_id": "PIZEXUFLAR.seg_279", "src_text": "Ok, now I'm going to talk about multi-modal instruction tuning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Okay, ich werde über Multimodale Anweisungstuning sprechen.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_53.wav", "doc_id": "TVCREhgqUP.seg_53", "src_text": "In this case, \"The girl slept.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "diesem Fall Übungen im Slip", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_497.wav", "doc_id": "SUkmfOTvGi.seg_497", "src_text": "We hope our paper calls for more research on how to improve generalizations of the models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir hoffen, dass unsere Arbeit weitere Forschungen darüber anregt, wie man die Verallgemeinerung der Modelle verbessern kann.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_573.wav", "doc_id": "rISrKoXQCx.seg_573", "src_text": "And vice versa, right-leaning language models are better at detecting hate speech targeting white and men, however worse at detecting hate speech targeting at black LGBTQ plus and other minority communities.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und umgekehrt sind Rechtschreibmodelle besser, wenn es darum geht, weiße und männliche Sprache zu erkennen, aber schlechter, wenn es darum geht, schwarze, LGBTQ- und andere Minderheiten zu erkennen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_448.wav", "doc_id": "hgIDlKNiFM.seg_448", "src_text": "One based on the weight of CamemBERT and trained on a 4 GB set of NACHOS.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Einer basierte auf dem Gewicht von Camembert und trainierte auf vier Kilogramm von Natüron, ein anderer", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_274.wav", "doc_id": "PIZEXUFLAR.seg_274", "src_text": "For investigating multi-modal instruction tuning on our proposed dataset, we take OFA, a unified multi-modal pre-trained model, as our base model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Um die Multimodalsteuerung auf unserem vorgeschlagenen Datensatz zu untersuchen, nehmen wir OFA als unser Basismodell;", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_604.wav", "doc_id": "oeooqChmKK.seg_604", "src_text": "After a long day at work deciding cases in a law court, he was happy to relax.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "sie einen langen Tag mit dem Entscheiden von Fällen in einem Gerichtshof verbracht hatten. Er war froh, sich zu entspannen.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_662.wav", "doc_id": "FLkGnzVRew.seg_662", "src_text": "We compare this to the other state-of-the-art AL strategies that are commonly used in the community.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir vergleichen dies mit anderen State-of-the-Art-Strategien, die in der Gemeinschaft allgemein verwendet werden. 'Nein, danke.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_642.wav", "doc_id": "FLkGnzVRew.seg_642", "src_text": "High cognitive dissonance is also related to anxiety disorders and can help understand people's mental health better.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hohe Konstitutionsunterschiede hängen auch mit Angststörungen zusammen und können helfen, das mentale Wohlbefinden der Menschen besser zu verstehen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_44.wav", "doc_id": "aQpIWggfCo.seg_44", "src_text": "We hope the CoScript dataset can be a valuable resource to advance research on language planning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir hoffen, dass CoScript eine wertvolle Ressource sein kann, um die Forschung zur Sprachplanung voranzutreiben.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_303.wav", "doc_id": "PIZEXUFLAR.seg_303", "src_text": "So overall, we propose the first large scale multi-model instruction tuning dataset with significantly improved their short capability of OFA, and we explore different transfer learning technique and show their benefits.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Daher schlagen wir vor, ein erstes groß angelegtes Datenbanksystem für die Anpassung von Modellen zu erstellen, das die Fähigkeit der OFV erheblich verbessert und wir untersuchen verschiedene Techniken für die Übertragung von Lernfähigkeiten und zeigen ihre Vorteile.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_786.wav", "doc_id": "WTTtiRKFZI.seg_786", "src_text": "OK.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "das", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_747.wav", "doc_id": "XejEJmgUmE.seg_747", "src_text": "And we can also do the same by choosing sentences from a different subset or a different data set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und wir können dasselbe tun, indem wir Sätze aus einem anderen Untermenü oder Datensatz auswählen, also das,", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_220.wav", "doc_id": "oYCKgTzTDy.seg_220", "src_text": "Existing cross-lingual semantic parsing models are separately proposed and evaluated on data set of limited tasks and applications.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "so weiter. Bestehende Crosslingual-Semantik-Modellierungsmodelle werden separat vorgeschlagen und auf Datensätzen begrenzter Aufgaben und Anwendungen bewertet,", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_163.wav", "doc_id": "SLpqvupgvW.seg_163", "src_text": "Consider this alternative question.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Betrachten Sie diese alternative", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_470.wav", "doc_id": "SUkmfOTvGi.seg_470", "src_text": "At the same time, if we do observe poor generalization, what causes the performance drop of these models?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wenn wir eine schlechte Generalisierung feststellen, was verursacht den Leistungsabfall dieser Modelle?", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_830.wav", "doc_id": "GvEBWkLmuI.seg_830", "src_text": "In recent years, many have documented the prevalence of social bias and stereotypes in large language models, or LLMs.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In den letzten Jahren haben viele die Prävalenzen von sozialen Vorurteilen und Stereotypen in großen Sprachmodellen dokumentiert.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_769.wav", "doc_id": "XejEJmgUmE.seg_769", "src_text": "Please read our paper for more details of our experiments.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Bitte lesen Sie unser Papier für weitere Details zu unseren", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_630.wav", "doc_id": "oeooqChmKK.seg_630", "src_text": "If you're interested in more details, please see our paper and check out the data set and code on GitHub.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wenn Sie mehr Details interessieren, bitte sehen Sie sich unser Papier an und überprüfen Sie das Datensatz und den Code auf GitHub. ke", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_224.wav", "doc_id": "oYCKgTzTDy.seg_224", "src_text": "For example, there's only one single model to evaluate them.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Ibo, Yoruba, Afrikaans, Albanisch, Bosnisch, Serbisch, Kroatisch, Montenegrin, Maltesisch, Sardinisch, Katalanisch,", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_659.wav", "doc_id": "FLkGnzVRew.seg_659", "src_text": "\"Cumulative\" accumulates all the data collected from active annotation so far, whereas \"Iterative\" updates the model by training on the latest set of data collected.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Daten, die aus aktiven Annotierungen gesammelt wurden, und Aktualisierung des Modells durch Training auf dem neuesten Satz von Daten,", "score": 74.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_372.wav", "doc_id": "gGbuDbHhyc.seg_372", "src_text": "To summarize, we showed that recent WSL approaches require clean, manually annotated samples for them to work properly.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zusammenfassend zeigen wir, dass die neuesten WSL-Ansätze saubere, manuell annotierte Beispiele benötigen, um richtig zu funktionieren:", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_798.wav", "doc_id": "WTTtiRKFZI.seg_798", "src_text": "But it's also OK to say, \"Marge read yesterday this absolutely fascinating book about bees.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "über Bienen gelesen habe, ich sage, dass ich gestern March Read Yesterday, dieses absolut faszinierende Buch über", "score": 48.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_92.wav", "doc_id": "uZBWfYjYnf.seg_92", "src_text": "Hi, I'm Sara Papi from the University of Trento and Foundazione Bruno Kessler and I will briefly introduce the \"Attention as a Guide for Simultaneous Speech Translation\" paper, that is a joint work with Matteo Negri and Marco Turchi.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo, ich bin Sera Pappi von der Universität von Trento und der Stiftung Bruno Kessler, und ich werde kurz die Aufmerksamkeit als Leitfaden für das Simultanübersetzungspapier vorstellen, das eine gemeinsame Arbeit mit Matteo Negri und Marco Turchi ist.", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_339.wav", "doc_id": "dJGfOSFgZO.seg_339", "src_text": "And we look forward to seeing how conversational AI will advance in the coming months and years.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "und wir freuen uns darauf, zu sehen, wie sich die kognitive KI in den kommenden Monaten und Jahren weiterentwickelt.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_546.wav", "doc_id": "rISrKoXQCx.seg_546", "src_text": "Hi, I'm Shangbin, PhD student in the University of Washington.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "„Ich bin Doktorand an der Universität von Washington.", "score": 79.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_649.wav", "doc_id": "FLkGnzVRew.seg_649", "src_text": "On collecting around 1,000 examples of discourse unit pairs, we ran training for an initial classifier trained only on 43 examples of dissonance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "finden. Bei der Zusammenstellung von Tausenden von Beispielen für Diskussionseinheiten trainieren wir nur für die Klassifizierung", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_135.wav", "doc_id": "wLqFAuDnKa.seg_135", "src_text": "The difference observed is of more than one BLEURT points.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Die Differenz von SRF ist mehr als ein Blasenpunkt.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_644.wav", "doc_id": "FLkGnzVRew.seg_644", "src_text": "Finally, cognitive dissonance is important to understand personal cognitive styles of individuals and helps us understand decision making processes better.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Kognitive Distanz ist wichtig, um persönliche kognitive Stile von Einzelpersonen zu verstehen, und hilft uns, Entscheidungsprozesse besser zu machen.", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_44.wav", "doc_id": "aQpIWggfCo.seg_44", "src_text": "We hope the CoScript dataset can be a valuable resource to advance research on language planning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir hoffen, dass die Scriptsammlung eine wertvolle Ressource für die weitere Forschung zur Sprachplanung sein kann.", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_614.wav", "doc_id": "oeooqChmKK.seg_614", "src_text": "Lastly, the \"Background-Inference\" setting, where both knowledge types are available only at inference time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Drittens gibt es die Hintergrundausrichtung, bei der beide Wissensarten nur in der Interventionszeit verfügbar sind.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_633.wav", "doc_id": "FLkGnzVRew.seg_633", "src_text": "I would like to present our work accepted into ACL 2023 as a long paper, \"Transfer Learning for Dissonance Detection: Addressing the Rare-Class Challenge.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Ich würde gerne meine Arbeit als langes Papier mit dem Titel „Transfer Learning for Dissimilarity Detection“ vorstellen. Beginnend", "score": 72.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_216.wav", "doc_id": "oYCKgTzTDy.seg_216", "src_text": "Today I'm going to present our work \"XSemPLR: Cross-Lingual Semantic Parsing in Multiple Natural Languages and Meaning Representations\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Heute werde ich meine Arbeit vorstellen: Beispiel: Cross-Linguistic Semantic Parsing in mehreren natürlichen Sprachen und vielen Darstellungen. Das", "score": 79.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_786.wav", "doc_id": "WTTtiRKFZI.seg_786", "src_text": "OK.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "produzieren.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_180.wav", "doc_id": "SLpqvupgvW.seg_180", "src_text": "Which is the alternative question.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Das ist die alternative", "score": 53.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_528.wav", "doc_id": "dvGkKzmIaN.seg_528", "src_text": "The weight of the target embedding is proportional to the number of triggers in the sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Das Gewicht des Ziel-Embeddings ist proportional zur Anzahl der Trigger im Satz.", "score": 83.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_397.wav", "doc_id": "WBLMIsdIrq.seg_397", "src_text": "To answer the first question, we started by measuring how much a word depends on context during translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Um die erste Frage zu beantworten, beginnen wir damit, zu messen, wie stark ein Wort von dem Kontext während der Übersetzung abhängt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_159.wav", "doc_id": "SLpqvupgvW.seg_159", "src_text": "Hi!", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hi,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_711.wav", "doc_id": "oaOHnMCwad.seg_711", "src_text": "We find that Dynahate is also most aligned to English speaking countries.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "wir stellen fest, dass Dianet Heat ebenfalls am meisten mit englischsprachigen Ländern übereinstimmt.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_628.wav", "doc_id": "oeooqChmKK.seg_628", "src_text": "However, with task-specific training, some models successfully integrate knowledge from multiple sources.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Allerdings können einige Modelle mit task-spezifischer Ausbildung erfolgreich Wissen aus mehreren Quellen integrieren.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_351.wav", "doc_id": "gGbuDbHhyc.seg_351", "src_text": "Technically, this claim is not wrong, but there's a catch, which is that people do assume that there's an additional clean validation set available for model selection.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Technisch gesehen ist dieser Anspruch nicht falsch, aber es gibt einen Haken. Das heißt, dass die Menschen davon ausgehen, dass es eine zusätzliche saubere Validierungsmethode für die Modellauswahl gibt. Wir werfen", "score": 83.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_343.wav", "doc_id": "gGbuDbHhyc.seg_343", "src_text": "This is joint work with Xiaoyu Shen, Marius Mosbach, Andreas Stephan, and Dietrich Klakow.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Lernfähigkeit. Das ist eine gemeinsame Arbeit mit Shaul Usch, Mario Muzspara, Andreas Stefan und Dietrich Klakow.", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_494.wav", "doc_id": "SUkmfOTvGi.seg_494", "src_text": "At the same time, we also found that the performance drop here is caused by temporal drift and kind of surprisingly, it is not caused by adaptive overfitting even though CoNLL-2003 has been used for over 20 years.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Gleichzeitig haben wir auch festgestellt, dass die Leistungsabnahme hier durch zeitliche Drift verursacht wird und, was ziemlich überraschend ist, nicht durch adaptives Überpassen. Obwohl \"Corne 2003\" seit mehr als zwanzig Jahren verwendet", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_330.wav", "doc_id": "dJGfOSFgZO.seg_330", "src_text": "You can see how the combination of all ABC-Eval metrics explains over 25% of conversation quality, and as you remove the metrics one at a time, most of them result in losing a decent amount of information about the quality.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Sie können sehen, wie die Kombination aller ABC-EVA-Metriken über 25 Prozent der Gesprächsqualität erklärt und wenn Sie die Metriken einzeln entfernen, verlieren die meisten von ihnen einen guten Teil der Informationen über die Qualität. Auf", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_731.wav", "doc_id": "XejEJmgUmE.seg_731", "src_text": "This is a joint work with John Gauthier, Aaron Mueller, Kanishka Misra, Karen Fences, Roger Levy, and Adina Williams.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "dürfen. Es ist eine gemeinsame Arbeit mit John Gauthier, Aaron Mueller, Kaniška Mistrá, Káren Fuentés, Roger Levy und Adina Williams.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_571.wav", "doc_id": "rISrKoXQCx.seg_571", "src_text": "So we see that if we investigate the per category performance, that is to say if we separate the performance into different demographics or political leaning of news media we can see a pattern.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "die Leistungskategorie untersuchen, das heißt, wenn wir die Leistung trennen. Unterschiedliche Demografien oder politische Nachrichtenmedien können zeigen, dass beispielsweise Sprachdetektionsmodelle bessere", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_447.wav", "doc_id": "hgIDlKNiFM.seg_447", "src_text": "In addition to this comparison, we introduced three models trained on continual pre-training to analyze the impact of pre-training strategy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Zusätzlich zu diesem Vergleich führen wir drei Modelle des kontinuierlichen Trainings ein, um die Auswirkungen der Trainingsstrategie zu analysieren.", "score": 86.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_628.wav", "doc_id": "oeooqChmKK.seg_628", "src_text": "However, with task-specific training, some models successfully integrate knowledge from multiple sources.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Mit einer taskspezifischen Schulung integrieren jedoch einige Modelle erfolgreich Wissen aus mehreren Quellen. Trotzdem", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_487.wav", "doc_id": "SUkmfOTvGi.seg_487", "src_text": "For data overfitting, we saw that from the graph on the right, the red best fit line has a gradient that is greater than one.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Bei der Anpassung der Überpassung haben wir festgestellt, dass die rote Bestpassungslinie auf der rechten Seite eine Steigung von mehr als eins hat.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_218.wav", "doc_id": "oYCKgTzTDy.seg_218", "src_text": "And Cross-Lingual Semantic Parsing is the task to translate queries in multiple natural languages into multiple meaning representations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die Aufgabe besteht darin, Aussagen in mehreren natürlichen Sprachen in mehrere Bedeutungsrepräsentationen zu übersetzen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_787.wav", "doc_id": "WTTtiRKFZI.seg_787", "src_text": "The argument is based on the principle of dependency length minimization that I will explain on the basis of these examples.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Ok, das Argument stützt sich auf das Prinzip der Abhängigkeitslängenminimierung, das wir auf der Grundlage dieser Beispiele erklären. So", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_310.wav", "doc_id": "dJGfOSFgZO.seg_310", "src_text": "And today we'll tell you all about ABC-Eval, a new dimensional approach to evaluating conversational AI.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "und heute werden wir Ihnen alles über ABC-Eval erzählen, einen neuen dimensionalen Ansatz zur Bewertung der konversationalen AI.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_41.wav", "doc_id": "aQpIWggfCo.seg_41", "src_text": "In summary, we establish the constrained language planning problem.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Zusammenfassend können wir sagen, dass wir das Problem der begrenzten Sprachplanung aufgestellt", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_507.wav", "doc_id": "dvGkKzmIaN.seg_507", "src_text": "For example, OpenAI offers a GPT based embedding API.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zum Beispiel bietet OpenLayers eine GPX-basierte Einbettungs-API. Jedoch haben jüngste", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_783.wav", "doc_id": "WTTtiRKFZI.seg_783", "src_text": "So we get dependencies from the governor.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "bekommen wir Abhängigkeiten von dem", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_124.wav", "doc_id": "wLqFAuDnKa.seg_124", "src_text": "PaLM is a 540 billion-parameter large language model presented last year in 2022.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "FARM ist ein 500-Milliarden-Parameter-Modell der großen Sprache, das im Jahr 2022 vorgestellt wurde.", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_134.wav", "doc_id": "wLqFAuDnKa.seg_134", "src_text": "The majority of sentences 516 out of 1,000.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die Mehrheit der Sätze, sechzehn von tausend,", "score": 88.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_191.wav", "doc_id": "SLpqvupgvW.seg_191", "src_text": "The second one is when the entities have similar titles, for example, two books with the name \"The Return\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die zweite ist, wenn die Entitäten ähnliche Titel haben, zum Beispiel zwei Bücher mit dem Namen „Der Retter“", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_34.wav", "doc_id": "aQpIWggfCo.seg_34", "src_text": "We apply our method for building a dataset of constrained language planning, named as CoScript.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir planen unsere Methode für den Aufbau eines Datensatzes für die kontrollierte Sprachplanung, genannt „Codescript“.", "score": 88.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_613.wav", "doc_id": "oeooqChmKK.seg_613", "src_text": "Second, there's a \"Background-Both\" setting, where background knowledge is available both at pretrain time and inference time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zweitens gibt es eine Hintergrundbeleuchtung, wobei die Hintergrundwissen sowohl zu Trainingszeiten als auch zu Interferenzzeiten verfügbar sind;", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_429.wav", "doc_id": "WBLMIsdIrq.seg_429", "src_text": "Thank you so much for your attention.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Vielen Dank für Ihre Aufmerksamkeit,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_365.wav", "doc_id": "gGbuDbHhyc.seg_365", "src_text": "But that's not the end of the story, because if we either way decide to access clean samples, then training on them directly will even achieve better performance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Aber das ist nicht das Ende der Geschichte, denn wenn wir uns entscheiden, saubere Proben zu verwenden, wird das Training damit sogar noch bessere Ergebnisse erzielen.", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_801.wav", "doc_id": "WTTtiRKFZI.seg_801", "src_text": "So here we have a dependency from \"read\" to the adjunct of length 7 measured in words and from \"read\" to \"book\" of length 4, so together it's 11.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "haben wir eine Abhängigkeit von red zu dem Ende von length sieben gemessen in Wörtern und von red zu book von length vier, also zusammen elf, wenn", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_600.wav", "doc_id": "oeooqChmKK.seg_600", "src_text": "Here is an example from our data set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "ein Beispiel aus unserem Datensatz:", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_37.wav", "doc_id": "aQpIWggfCo.seg_37", "src_text": "This figure shows the constraint distribution of CoScript.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Diese Abbildung zeigt eine eingeschränkte Verteilung von CoScript.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_384.wav", "doc_id": "WBLMIsdIrq.seg_384", "src_text": "A Data-driven, Multilingual Exploration\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "- a data-driven multilingual exploration' vorstellen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_19.wav", "doc_id": "aQpIWggfCo.seg_19", "src_text": "The heat map in the figure shows that the planning performance of InstructGPTs varies considerably for goals of different categories.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "die Karteikarte zeigt, dass die Planungsleistung von Unterrichtseinrichtungen für Mädchen unterschiedlicher Kategorien beträchtlich variiert.", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_204.wav", "doc_id": "SLpqvupgvW.seg_204", "src_text": "For example, \"the one without words\", \"not the one with the 12 year old boy\", or \"the fictional one\", or \"comes from Azerbaijan\", and so on.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "zum Beispiel die ohne Worte, nicht der mit dem zwölfjährigen Jungen, oder die fiktive, die aus Aserbaidschan kommt und so weiter.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_425.wav", "doc_id": "WBLMIsdIrq.seg_425", "src_text": "But these models are not much better than models that do not use context on other phenomena like ellipsis, pronouns, and verb form.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "diese Modelle sind nicht viel besser als die Modelle, die keine Kontexte auf anderen", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_280.wav", "doc_id": "PIZEXUFLAR.seg_280", "src_text": "So for the training dataset, we use 53 tasks from 9 groups for training and we sample 10,000 instances per task.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Für das Trainingsdatenset verwenden wir fünfunddreißig Aufgaben aus der N-Gruppe für das Training und beispielhaft zehntausend Instanzen", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_283.wav", "doc_id": "PIZEXUFLAR.seg_283", "src_text": "In addition, we randomly sample 20 tasks from the test split of natural instructions as an unseen task for NLP.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "jede Aufgabe; zudem nehmen wir zufällig eine Aufgabe aus dem Testset der natürlichen Anweisung als unsichtbare Aufgabe für den NLP.", "score": 78.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_528.wav", "doc_id": "dvGkKzmIaN.seg_528", "src_text": "The weight of the target embedding is proportional to the number of triggers in the sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Das Gewicht des Ziel-Embeddings ist proportional zur Anzahl der Trigger im Satz.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_866.wav", "doc_id": "GvEBWkLmuI.seg_866", "src_text": "So for example, the words describing Latina women include things like \"vibrant\" and \"curvaceous\" which connect to a trope of tropicalism.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "der Farbe, wie zum Beispiel die lateinische Frau, die lebendig und lebhaft ist.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_222.wav", "doc_id": "oYCKgTzTDy.seg_222", "src_text": "But Chinese is missing and lack of coverage on certain meaning representation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Usbekisch, Kirgisisch, Mongolisch, Tibetisch, Nepali, Bengali, Marathi, Gujarati, Telugu, Tamil,", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_490.wav", "doc_id": "SUkmfOTvGi.seg_490", "src_text": "So what about temporal drift then?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Was ist mit der Temperaturdiffusion? Für", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_713.wav", "doc_id": "oaOHnMCwad.seg_713", "src_text": "So for GPT 4, in the social acceptability task, we find that it's most aligned to people with a college education or Graduate School education and we find the same for Dynahate where it's most aligned to people with a college education.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "eine Hochschulausbildung haben, so dass für die Aufgabe der sozialen Eingängigkeit vier GBD finden, dass es die meisten Verbindungen mit Personen mit einer Hochschulausbildung oder einer Abschluss-Hochschulausbildung gibt. Und wir finden das Gleiche für Donny Haide, wo es den Menschen mit einer Hochschulbildung am meisten zusagt.", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_647.wav", "doc_id": "FLkGnzVRew.seg_647", "src_text": "Tweets were parsed using the PDTB parser, and pairs of discourse units were annotated according to the guidelines that are described in our paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die Tweets wurden unter Verwendung eines PTB-Parsers und Paare von Diskurs-Einheiten gemäß den Richtlinien, die im Papier beschrieben sind, analysiert.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_297.wav", "doc_id": "PIZEXUFLAR.seg_297", "src_text": "So we also did one experiment.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "So führen wir auch ein Experiment durch,", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_119.wav", "doc_id": "uZBWfYjYnf.seg_119", "src_text": "If you want to discover more results, read our paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wenn Sie mehr Ergebnisse entdecken möchten, lesen Sie unser Papier,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_710.wav", "doc_id": "oaOHnMCwad.seg_710", "src_text": "So for the GPT 4 social acceptability analysis, we find that it's most aligned to confucian and English speaking countries.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "dass für die Analyse der sozialen Akzeptabilität der GPD 4 festgestellt wird, dass es am meisten mit Konflikt und englischsprachigen Ländern übereinstimmt, und", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_804.wav", "doc_id": "WTTtiRKFZI.seg_804", "src_text": "That's why this sounds quite okay.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "klingt das in Ordnung.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_630.wav", "doc_id": "oeooqChmKK.seg_630", "src_text": "If you're interested in more details, please see our paper and check out the data set and code on GitHub.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wenn Sie mehr Einzelheiten erfahren möchten, sehen Sie sich bitte unser Paper und den Datensatz auf GitHub an.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_870.wav", "doc_id": "GvEBWkLmuI.seg_870", "src_text": "And while it sounds positive at first glance, there's been work showing that this kind of archetype actually is very harmful because it puts a lot of pressure on these demographics to be resilient and strong against societal obstacles.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "und klingt auf den ersten Blick positiv. Es hat sich gezeigt, dass diese Art von Archetypus eigentlich sehr schädlich ist, weil sie viel Druck auf diese Demografien ausübt, um widerstandsfähig und stark gegen soziale Hindernisse zu sein.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_95.wav", "doc_id": "uZBWfYjYnf.seg_95", "src_text": "And what are the problems of the current SimulST models?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und was sind die Probleme der aktuellen SimulS-Modelle?", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_588.wav", "doc_id": "rISrKoXQCx.seg_588", "src_text": "Thank you for your time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dank für Ihre Zeit.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_693.wav", "doc_id": "oaOHnMCwad.seg_693", "src_text": "Our framework works in two main steps.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Unser Rahmenwerk funktioniert in zwei Hauptstufen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_703.wav", "doc_id": "oaOHnMCwad.seg_703", "src_text": "We've then compared these annotations with Social Chemistry, Delphi and GPT 4.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir vergleichen dann diese Anmerkungen mit Social Chemistry, Delphi und GPT-4. Wir wiederholen", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_491.wav", "doc_id": "SUkmfOTvGi.seg_491", "src_text": "For temporal drift, we did an experiment to retrain or continue to pre-train some models with more recent data and we found that the performance degrades with larger temporal gap and this confirms our hypothesis that the main cause of the performance drop is temporal drift.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "die zeitliche Drift haben wir ein Experiment durchgeführt, um einige Modelle mit neueren Daten neu zu trainieren oder weiter zu pre-trainieren, und wir haben festgestellt, dass die Leistung mit größeren zeitlichen Lücken abnimmt. Dies bestätigt unsere Hypothese, dass die Hauptursache für den Leistungsabfall die zeitliche Drift ist.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_175.wav", "doc_id": "SLpqvupgvW.seg_175", "src_text": "Our data set collection methodology emphasizes informality using a cartoon completion setup.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Unsere Datensatzsammlungsmethode betont die Informalität mit einem Cartoon-Completion-Setup.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_764.wav", "doc_id": "XejEJmgUmE.seg_764", "src_text": "And after doing like several of these perturbations, we find that none of these noises are actually making the model like change its course in terms of how it shows us the MPP judgement print.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "„Lärm“ zu versehen, und nachdem wir viele dieser Störungen durchgeführt hatten Wir stellen fest, dass keiner dieser Geräusche tatsächlich den Modellverlauf in Bezug auf die Art und Weise, wie es", "score": 59.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_213.wav", "doc_id": "SLpqvupgvW.seg_213", "src_text": "Here is a link to our dataset.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hier ist ein Link zu unseren Datensätzen.", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_290.wav", "doc_id": "PIZEXUFLAR.seg_290", "src_text": "If it's a multi-modal generation task, we report Rouge-L. For NLP task, we report Rouge-L as well.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "es sich um eine multimodale Generierungsaufgabe handelt, berichten wir über RUGL, und für NPR-Aufgaben berichten wir auch über RUGL.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_392.wav", "doc_id": "WBLMIsdIrq.seg_392", "src_text": "Firstly because only a small portion of translations depend on context which makes corpus-level metrics like BLEU unable to capture these translations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "ist jedoch ziemlich schwer: Erstens, weil nur ein kleiner Teil der Übersetzungen vom Kontext abhängt, was Korpus-Ebenen-Metriken wie Blue daran hindert, diese Übersetzungen zu erfassen. Und", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_423.wav", "doc_id": "WBLMIsdIrq.seg_423", "src_text": "This again demonstrates that it is difficult to determine the best document-level translation system if we use corpus-level metrics alone.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Dies zeigt, dass es schwierig ist, das beste Dokumentenverarbeitungssystem zu ermitteln, wenn man die Korpus-Metrik verwendet. Jetzt", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_723.wav", "doc_id": "oaOHnMCwad.seg_723", "src_text": "I mean, we want to emphasise that inclusive NLP isn't just making.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "ist die Maska-Initiative. Ich möchte betonen, dass inklusives P nicht nur für alle", "score": 61.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_811.wav", "doc_id": "WTTtiRKFZI.seg_811", "src_text": "So when the difference between the lengths of the two conjuncts grows, the shorter conjunct prefers to be the first one, stronger, right?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "also der Unterschied zwischen den Längen der beiden Konjunktionen groß ist, dann ist der kürzere Konjunkt die erste stärkere, also ist", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_193.wav", "doc_id": "SLpqvupgvW.seg_193", "src_text": "And finally when they have similar info boxes or attributes on Wikipedia.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und schließlich, wenn sie ähnliche Infoboxen oder Attribute auf Wikipedia haben,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_142.wav", "doc_id": "wLqFAuDnKa.seg_142", "src_text": "And when we go, as in our case, to five-shot prompting, there is nearly no difference to the actual form of the prompting.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "wenn wir in unserem Fall zum Schießen gehen, gibt es fast keine Unterschiede zur tatsächlichen Form des Schießens. Es", "score": 29.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_802.wav", "doc_id": "WTTtiRKFZI.seg_802", "src_text": "When you swap these two constituents, the sum of these two dependencies becomes 6.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Sie diese beiden Konstituenten verschieben, dann werden einige dieser beiden Abhängigkeiten zu", "score": 58.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_331.wav", "doc_id": "dJGfOSFgZO.seg_331", "src_text": "On the other hand, the combination of all turn-level Likert metrics explains far less of the quality, and fewer of these metrics carry unique information.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "der anderen Seite erklärt die Kombination alternativer Lickert-Metriken viel weniger der Qualität und weniger dieser Metriken tragen einzigartige Informationen.", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_849.wav", "doc_id": "GvEBWkLmuI.seg_849", "src_text": "So for instance, the word \"warrior\" is usually associated with men.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "So ist das Wort „Mann“ oder „Krieger“ normalerweise mit „Mann“ verbunden,", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_598.wav", "doc_id": "oeooqChmKK.seg_598", "src_text": "We introduce a coreference resolution task, designed to probe for the ability to draw on knowledge available in different sources.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir führen eine Korrelationsanalyse durch, die darauf ausgelegt ist, die Fähigkeit zu bewerten, auf Wissen in verschiedenen Quellen zuzugreifen.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_865.wav", "doc_id": "GvEBWkLmuI.seg_865", "src_text": "Furthermore, there's a lot of common tropes that are reflected in these words, especially for women of color.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und noch mehr sind die vielen Gemeinsamkeiten, die in diesen Wörtern enthalten sind, insbesondere für eine", "score": 63.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_494.wav", "doc_id": "SUkmfOTvGi.seg_494", "src_text": "At the same time, we also found that the performance drop here is caused by temporal drift and kind of surprisingly, it is not caused by adaptive overfitting even though CoNLL-2003 has been used for over 20 years.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Gleichzeitig stellten wir auch fest, dass der Leistungsverlust hier durch zeitliche Schwankungen verursacht wird, und überraschenderweise ist er nicht durch adaptiven Überdrehungskoeffizienten verursacht, obwohl der Cornal 2003 seit über zwanzig Jahren verwendet wird.", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_564.wav", "doc_id": "rISrKoXQCx.seg_564", "src_text": "For example, for RoBERTa further trained on the left-leaning Reddit corpus we can see a substantial liberal shift in terms of its political biases.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zum Beispiel können wir für Roberta, die eine weiter gefällige Stimme hat und weiter auf dem linken Korpus trainiert ist, einen wesentlichen liberalen Wandel in ihrer Stimme sehen. In Bezug auf seine politischen Überzeugungen.", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_294.wav", "doc_id": "PIZEXUFLAR.seg_294", "src_text": "As we can see, instruction tuning can significantly improve OFA's performance on seen multi-modal tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "wie wir sehen können, dass die Anpassung der Anweisungen die Leistung von OIS auf ähnlichen Multimodal-Aufgaben erheblich verbessern kann.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_779.wav", "doc_id": "WTTtiRKFZI.seg_779", "src_text": "Now those are asymmetric approaches to coordinate structures, such as the Prague approach.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "symmetrischen Ansätze für Koordinatenstrukturen, wie z. B. der", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_405.wav", "doc_id": "WBLMIsdIrq.seg_405", "src_text": "First, we look at part-of-speech tags that have high mean P-CXMI.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zunächst sehen wir uns die Sprachetiketten an, die hohe Minen haben, wie z.B.", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_165.wav", "doc_id": "SLpqvupgvW.seg_165", "src_text": "Here, a user wants to select between one of these two songs.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hier möchte ein Benutzer zwischen zwei dieser beiden Lieder wählen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_418.wav", "doc_id": "WBLMIsdIrq.seg_418", "src_text": "We then use the MuDA tagger, by applying the tagger on a parallel corpus that we want to use for evaluation and we apply our translation metrics of choice on the context-dependent examples that the MuDA tagger has identified.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "haben. Dann verwenden wir den Muta-Tagger, indem wir den Tagger auf das parallele Korpus anwenden, das wir für die Bewertung verwenden möchten. Und wir wenden unsere Übersetzungsmetriken zur Auswahl auf die kontextabhängigen Beispiele an, die der Modus-Tagger identifiziert hat.", "score": 73.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_77.wav", "doc_id": "TVCREhgqUP.seg_77", "src_text": "Then we jump to the next multiset token, to determine the second token in the output.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dann springen wir zum nächsten Multiset-Token, um den zweiten Token im Output zu bestimmen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_777.wav", "doc_id": "WTTtiRKFZI.seg_777", "src_text": "Right.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "die", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_122.wav", "doc_id": "wLqFAuDnKa.seg_122", "src_text": "Hello everyone, my name is David Vilar, and I will be giving a short review of the paper \"Prompting PaLM for Translation: Assessing Strategies and Performance.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo, mein Name ist Said, und ich werde eine kurze Übersicht über das Papier „Translation, Assessing Strategies and Performance“ geben.", "score": 68.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_869.wav", "doc_id": "GvEBWkLmuI.seg_869", "src_text": "This connects to an archetype that people have called the \"Strong Black Women\" archetype.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Dies verbindet sich mit einem Archetyp, den die Menschen als den starken schwarzen Archetyp bezeichnet haben,", "score": 83.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_113.wav", "doc_id": "uZBWfYjYnf.seg_113", "src_text": "But also we want that they are shifted on the left.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Aber auch wir wollen, dass sie auf die linke Seite versetzt werden", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_755.wav", "doc_id": "XejEJmgUmE.seg_755", "src_text": "We increase the context length toward up to 1024 for to max out OPT and GPT 2 models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir verlängerten den Kontext auf bis zu 10.000, um die OPT- und GPT-2-Modelle zu maximieren, und", "score": 76.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_360.wav", "doc_id": "gGbuDbHhyc.seg_360", "src_text": "Otherwise, there is a large performance drop.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Ansonsten gibt es einen großen Leistungsabfall,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_327.wav", "doc_id": "dJGfOSFgZO.seg_327", "src_text": "In addition, ABC-Eval labels are more predictive of the overall conversation quality compared to metrics produced by existing methods, as shown by this simple linear regression analysis.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Darüber hinaus sind ABC-EV-Labels im Hinblick auf die Gesprächsqualität der Gesamtkommunikation vorhersagbarer als Metriken, die von existierenden Methoden erzeugt werden, wie es durch diese einfachen linearen Regressionen gezeigt", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_352.wav", "doc_id": "gGbuDbHhyc.seg_352", "src_text": "We can't stop on this problem setting, but this implies that additional manual annotations are required in weakly supervised learning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Validierungsset gibt. Wir haben Zweifel an dieser Problemstellung, aber das impliziert, dass zusätzliche manuelle Anmerkungen beim Erlernen von Wikis erforderlich sind,", "score": 68.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_167.wav", "doc_id": "SLpqvupgvW.seg_167", "src_text": "But sometimes an indirect reference is more appropriate to have a more natural conversation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "erste. Aber manchmal ist ein indirekter Hinweis angemessener, um eine natürlichere Unterhaltung zu haben;", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_316.wav", "doc_id": "dJGfOSFgZO.seg_316", "src_text": "One approach is to simply ask human judges to evaluate several dimensions of dialogue quality, such as the relevance of model responses using existing comparative or Likert scale methods.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Eine Möglichkeit besteht darin, einfach Menschen zu bitten, mehrere Dimensionen der Dialogqualität zu bewerten, wie die Relevanz der Modellantworten, mithilfe bestehender vergleichender oder Likert-Skala-Methoden.", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_132.wav", "doc_id": "wLqFAuDnKa.seg_132", "src_text": "Finally, we provide some recommendations for prompt selection strategies.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "geben wir einige Empfehlungen für prompte Auswahlstrategien. Die Stimulierung", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_844.wav", "doc_id": "GvEBWkLmuI.seg_844", "src_text": "Our prompts to generate these personas were inspired by a study where they gave these prompts to human subjects, finding that by giving it to human subjects, they also were able to surface racial stereotypes.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Unsere Proben, die diese Persönlichkeiten hervorbrachten, wurden von einer Studie inspiriert, in der diese Proben an menschlichen Subjekten getestet wurden, wobei festgestellt wurde, dass sie auch rassenspezifische Stereotypen aufweisen. Und", "score": 86.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_736.wav", "doc_id": "XejEJmgUmE.seg_736", "src_text": "And then the hope is that the model, basically, puts more probability to the acceptable sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "dann hofft man, dass das Modell im Wesentlichen mehr Wahrscheinlichkeit für den akzeptablen Satz hat. Die derzeitige Pipeline", "score": 79.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_757.wav", "doc_id": "XejEJmgUmE.seg_757", "src_text": "Now, what happens when we choose sentences from the same data set?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Was passiert nun, wenn wir Sätze aus dem gleichen Datensatz auswählen?", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_145.wav", "doc_id": "wLqFAuDnKa.seg_145", "src_text": "So it's important to select the examples from high-quality translations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "ist also wichtig, Beispiele aus hochwertigen Übersetzungen auszuwählen,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_304.wav", "doc_id": "PIZEXUFLAR.seg_304", "src_text": "We design a new metric called sensitivity.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir entwerfen ein neues Maß, das Sensitivity genannt wird.", "score": 62.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_518.wav", "doc_id": "dvGkKzmIaN.seg_518", "src_text": "Therefore, in this paper we propose Embedding marker, which is a backdoor based watermark method applicable to embedding as services.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "schlagen wir in dieser Arbeit eine Implementierungsmarker vor, die eine Backdoor-basierte Wasserzeichenmethode ist, die auf die Implementierung von ADS-Diensten anwendbar ist.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_589.wav", "doc_id": "oeooqChmKK.seg_589", "src_text": "Hello everyone, I'm Akshatha, and today my co-author Martin and I are presenting our work \"The KITMUS Test: Evaluating Knowledge Integration from Multiple Sources.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hallo alle, ich bin Amratha Mathew und ich präsentiere heute meine Arbeit, die KIT Masterclass: Evaluierung der Wissensintegration aus mehreren Quellen.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_9.wav", "doc_id": "aQpIWggfCo.seg_9", "src_text": "A good planner should write scripts that are reasonable and faithful to constraints.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Ein guter Planer sollte Skripte schreiben, die vernünftig und den Einschränkungen treu sind.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_635.wav", "doc_id": "FLkGnzVRew.seg_635", "src_text": "Simply put, cognitive dissonance is two beliefs or actions that are inconsistent, such as this example where a person states, \"I know that cigarettes could kill me\", and then goes on to say \"I grabbed a couple of smokes after the meeting\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "studieren, ist kognitive Dissonanz einfach zwei Überzeugungen oder Handlungen, die inkonsistent sind. Dieses Beispiel zeigt, dass ich weiß, dass die Zigaretten mich umbringen würden, und dann sage ich, dass ich nach dem Treffen ein paar Rauchpausen einlegen würde.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_105.wav", "doc_id": "uZBWfYjYnf.seg_105", "src_text": "Our solution is to propose EDAtt, or Encoder-Decoder Attention, and it is a strategy for which we decide whether to emit or not a partial translation, based on where attention points to.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Unsere Lösung besteht darin, einen EDA- oder einen Encoder für die Codierung der Aufmerksamkeit zu vorschlagen, und es ist eine Strategie, bei der wir entscheiden, ob wir eine partielle Übersetzung senden oder nicht, basierend auf der Position der Aufmerksamkeit.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_756.wav", "doc_id": "XejEJmgUmE.seg_756", "src_text": "And we saw here in the orange dotted line, the MPP judgments are relatively stable.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "wir sahen hier in der orangefarbenen Zeile, dass die MPP-Urteile relativ stabil sind.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_746.wav", "doc_id": "XejEJmgUmE.seg_746", "src_text": "So we can do the same thing by choosing unacceptable sentences from the same matching, and that could also be used to test the models acceptability.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "hinzufügen. Daher können wir dasselbe tun, indem wir unannehmbare Sätze aus demselben Matching auswählen, und das könnte auch verwendet werden, um die Akzeptanzfähigkeit der Modelle zu testen.", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_804.wav", "doc_id": "WTTtiRKFZI.seg_804", "src_text": "That's why this sounds quite okay.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "klingt das ganz in", "score": 8.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_312.wav", "doc_id": "dJGfOSFgZO.seg_312", "src_text": "So let's say that you just developed a dialogue model and you want to see how well it compares against the current state-of-the-art.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Sie uns also sagen, dass Sie gerade ein Dialogmodell entwickelt haben und sehen möchten, wie gut es mit dem aktuellen Stand der Technik vergleichbar", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_626.wav", "doc_id": "oeooqChmKK.seg_626", "src_text": "Additional experiments with fictional knowledge indicated even the best performing models, cannot reliably integrate backward knowledge provided only at inference time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Zusätzliche Experimente mit fiktivem Wissen zeigen, dass selbst die besten Modelle das Hintergrundwissen, das ihnen zur Verfügung gestellt wird, nicht zuverlässig integrieren können.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_193.wav", "doc_id": "SLpqvupgvW.seg_193", "src_text": "And finally when they have similar info boxes or attributes on Wikipedia.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und schließlich, wenn sie ähnliche Infoboxen oder Attribute auf Wikipedia haben,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_717.wav", "doc_id": "oaOHnMCwad.seg_717", "src_text": "So, given that there is positionality in NLP, what can we do about it?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Heat Task Analysen. Also, was können wir tun, wenn", "score": 56.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_292.wav", "doc_id": "PIZEXUFLAR.seg_292", "src_text": "So this measures the model's ability to consistently produce the same outputs for the same task regardless of the slight variation in the wording of the instruction.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "die Fähigkeit des Modells misst, unabhängig von geringfügigen Abweichungen in der Anweisung immer die gleichen Ergebnisse für die gleiche Aufgabe zu liefern. Hier sind unsere Hauptergebnisse, wie man sieht, kann die", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_675.wav", "doc_id": "oaOHnMCwad.seg_675", "src_text": "I'm Jenny, a first year PhD student at Carnegie Mellon University and today I'll be presenting your work NLPositionality characterising design biases of datasets and Models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Jenny, Studentin im ersten Jahr an der Universität Karnegi-Mellon, und werde heute meine Arbeit und ihre Position beschreiben, indem ich Datenmodelle erstelle.", "score": 45.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_543.wav", "doc_id": "dvGkKzmIaN.seg_543", "src_text": "That's all.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Das ist alles, danke.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_624.wav", "doc_id": "oeooqChmKK.seg_624", "src_text": "When trained on KITMUS, however, both C2F and BERT4Coref perform significantly better than the random choice.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "nicht gut. Das Training auf dem Kidmus hingegen ist bei beiden Modellen signifikant besser als die zufällige Wahl.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_95.wav", "doc_id": "uZBWfYjYnf.seg_95", "src_text": "And what are the problems of the current SimulST models?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "sind die Probleme der aktuellen Simulink-Modelle?", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_215.wav", "doc_id": "oYCKgTzTDy.seg_215", "src_text": "Hello everyone, my name is Yusen Zhang from the Penn State University.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hallo alle, mein Name ist Usain John von der Universität von Pennsylvania.", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_67.wav", "doc_id": "TVCREhgqUP.seg_67", "src_text": "For the first time, we show strong generalization to deeper recursion without relying on trees.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zum ersten Mal zeigen wir eine starke Generalisierung zu tiefer Rekursion ohne auf Bäume zurückgreifen zu müssen.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_789.wav", "doc_id": "WTTtiRKFZI.seg_789", "src_text": "So \"Marge read it yesterday\" is fine because the direct object is close to the verb, while \"Marge read yesterday it\" is much worse.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "is close to the verb.“ Während es gestern Abend viel schlimmer war, ist es heute viel besser, weil", "score": 5.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_856.wav", "doc_id": "GvEBWkLmuI.seg_856", "src_text": "However, when we actually look at the distribution of the words and lexicon, we find very different things.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wir verwenden. Wir finden jedoch sehr unterschiedliche Dinge, wenn wir uns die Verteilung der Wörter im Lexikon ansehen. Also haben", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_421.wav", "doc_id": "WBLMIsdIrq.seg_421", "src_text": "But then if we use COMET, context-aware models perform best.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "haben. Dann verwenden wir jedoch Comet, kontextbewusste Modelle, die am besten funktionieren,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_519.wav", "doc_id": "dvGkKzmIaN.seg_519", "src_text": "Then let me introduce the details of our embedding marker.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dann lassen Sie mich die Einzelheiten unseres eingebetteten Markers erläutern.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_362.wav", "doc_id": "gGbuDbHhyc.seg_362", "src_text": "This indicates that WSL approaches actually require cleanly labeled data to work properly, and the annotation cost for obtaining clean validation samples should not be overlooked.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dies zeigt, dass WSL-Ansätze tatsächlich sauber gekennzeichnete Daten erfordern, um ordnungsgemäß zu funktionieren, und dass die Anmerkungskosten für die Erlangung sauberer Validierungsbeispiele nicht vernachlässigt werden sollten.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_361.wav", "doc_id": "gGbuDbHhyc.seg_361", "src_text": "As shown in this figure, if there are no clean validation samples, then the trained models cannot generalize beyond the original weak labels, meaning that the training is pointless.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "wie in dieser Abbildung zu sehen ist. Wenn es keine sauberen Validierungsmuster gibt, können die Trendmodelle nicht über die ursprünglichen Bit-Etiketten hinaus generalisiert werden. Das bedeutet, dass diese Doktrin sinnlos ist.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_631.wav", "doc_id": "oeooqChmKK.seg_631", "src_text": "Thanks for listening.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "für das Zuhören.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_751.wav", "doc_id": "XejEJmgUmE.seg_751", "src_text": "Finally, we can choose sentences from a completely unrelated domain such as Wikipedia.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Schließlich können wir Sätze aus einer völlig unabhängigen Domäne wie Wikipedia auswählen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_508.wav", "doc_id": "dvGkKzmIaN.seg_508", "src_text": "However, recent works have shown that the attacker may steal the model through learning from the embedding and provide similar services.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Arbeiten gezeigt, dass der Angreifer das Modell durch das Lernen vom Embedding stehlen und ähnliche Dienste anbieten kann.", "score": 83.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_783.wav", "doc_id": "WTTtiRKFZI.seg_783", "src_text": "So we get dependencies from the governor.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "so dass wir von der Regierungsstruktur Abhängigkeiten erhalten, die", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_853.wav", "doc_id": "GvEBWkLmuI.seg_853", "src_text": "So for instance, for the personas of black women, we would do Fightin’ Words and compare the log-odds ratios against both white personas and man personas because those are the two corresponding unmarked groups.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Zum Beispiel für die Persönlichkeiten der schwarzen Frauen werden wir Wörter sammeln und die Logod-Ratios für beide weißen und männlichen Persönlichkeiten vergleichen, weil es sich um zwei nicht markierte Gruppen handelt.", "score": 53.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_277.wav", "doc_id": "PIZEXUFLAR.seg_277", "src_text": "We follow the method from OFA and formulate all the tasks in a unified sequence-to-sequence format.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "folgen wir der Masterform von OFA und formulieren alle Aufgaben in einer vereinheitlichten Sequenz-zu-Sequenz-Format.", "score": 74.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_701.wav", "doc_id": "oaOHnMCwad.seg_701", "src_text": "We host 2 tasks on lab in the wild, one of them being social acceptability, and the way this works is that participants will read a situation from the social chemistry dataset and, then they'll write how socially acceptable a situation is.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und die Art und Weise, wie diese Arbeit ist, dass die Teilnehmer eine Situation aus der Sozialchemie erhalten und wie sozialverträglich diese Situation ist.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_546.wav", "doc_id": "rISrKoXQCx.seg_546", "src_text": "Hi, I'm Shangbin, PhD student in the University of Washington.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hallo, ich bin John Bin, Doktorand an der University of Washington.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_255.wav", "doc_id": "oYCKgTzTDy.seg_255", "src_text": "For example, Encoder-Decoder outperforms previous work or achieves comparable results.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "zum Beispiel Encoder-Decoder-Modelle, die vergleichbare Ergebnisse", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_282.wav", "doc_id": "PIZEXUFLAR.seg_282", "src_text": "We use all the instances in the test split for each task.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "die Prüfung aus. Wir verwenden alle Instanzen im Test für", "score": 82.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_747.wav", "doc_id": "XejEJmgUmE.seg_747", "src_text": "And we can also do the same by choosing sentences from a different subset or a different data set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und wir können dasselbe auch erreichen, indem wir Sätze aus einem anderen Satz oder einem anderen Datensatz auswählen.", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_52.wav", "doc_id": "TVCREhgqUP.seg_52", "src_text": "As usual, we have a training set of utterances.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "aus, als hätten Sie in diesem Fall ein", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_100.wav", "doc_id": "uZBWfYjYnf.seg_100", "src_text": "So what is our solution?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Was ist unsere Lösung?", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_844.wav", "doc_id": "GvEBWkLmuI.seg_844", "src_text": "Our prompts to generate these personas were inspired by a study where they gave these prompts to human subjects, finding that by giving it to human subjects, they also were able to surface racial stereotypes.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Unsere Anfragen zur Erzeugung dieser Personen wurden von einer Studie inspiriert, in der sie diesen Anfragen menschliche Teilnehmer gaben und fanden, dass sie auch Rassenstereotypen aufdecken konnten.", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_246.wav", "doc_id": "oYCKgTzTDy.seg_246", "src_text": "We found that Encoder-Decoder or Encoder-PTR can be improved by training in a mixture of various languages.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir stellten fest, dass Encoder-Decoder oder Encoder-PDR durch Schulung in einer Mischung verschiedener Sprachen verbessert werden können.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_667.wav", "doc_id": "FLkGnzVRew.seg_667", "src_text": "We find that PRC has the highest percentage of dissonance and works best for rare class.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir stellen fest, dass die Anmerkung mit dem höchsten Prozentsatz von Anmerkungen und der besten Anmerkung für die Klasse ist.", "score": 27.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_515.wav", "doc_id": "dvGkKzmIaN.seg_515", "src_text": "Finally, the watermark needs to be transferable to the attacker's services during the model extraction process.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "schließlich muss das Wasserzeichen während des Modellextraktionsprozesses auf die Services des Angreifers übertragbar sein.", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_182.wav", "doc_id": "SLpqvupgvW.seg_182", "src_text": "We provide the first and second speech bubbles automatically, but the third one is filled in by the annotator.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir stellen automatisch die ersten beiden Sprachbubbles zur Verfügung, aber der dritte wird vom Kommentator gefüllt.", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_859.wav", "doc_id": "GvEBWkLmuI.seg_859", "src_text": "And in fact, this lexicon doesn't really capture many of the harmful patterns that we saw in the earlier slides well at all.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und in der Tat hat der Lesekontext nicht wirklich viele der schädlichen Muster erfasst, die wir in den früheren Zeilen gesehen haben,", "score": 82.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_877.wav", "doc_id": "GvEBWkLmuI.seg_877", "src_text": "We just really can't make any assumptions or really study that further, without more transparency.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir konnten wirklich keine Annahmen treffen, und wir untersuchten das weiter mit mehr Transparenz.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_582.wav", "doc_id": "rISrKoXQCx.seg_582", "src_text": "So if we do not sanitize political opinions in language model training data, the bias would propagate from pretraining data to language models to downstream tasks, ultimately creating fairness issues.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wenn wir die politischen Meinungen in den Sprachtrainingsdaten nicht standardisieren, würde sich die Vorliebe auf die vorherigen Sprachmodelle bis hinunter zu den Aufgaben erstrecken, was letztlich zu Fairnessfragen führen würde.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_462.wav", "doc_id": "hgIDlKNiFM.seg_462", "src_text": "So thank you for this presentation, and we are looking forward to exchange at the poster session in Toronto.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dank für diese Präsentation, und wir freuen uns auf Aktionen bei der Post in Toronto.", "score": 57.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_192.wav", "doc_id": "SLpqvupgvW.seg_192", "src_text": "The third one is when they have similar descriptions on Wikipedia.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und die dritte Methode ist die gleichartige Zufallsauswahl, z.B. zwei Bücher", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_171.wav", "doc_id": "SLpqvupgvW.seg_171", "src_text": "Here are some examples of indirect references for example, \"the newer one\" or \"the song that's not energetic.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "hier sind einige Beispiele für direkte Präferenzen, z. B. der neueste oder der nicht energiereiche.", "score": 69.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_134.wav", "doc_id": "wLqFAuDnKa.seg_134", "src_text": "The majority of sentences 516 out of 1,000.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die Mehrheit der Sätze – 516 von 1000", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_4.wav", "doc_id": "aQpIWggfCo.seg_4", "src_text": "And show that large language models can effectively decompose goals into steps.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "und zeigen, dass große Sprachmodelle Ziele effektiv in Schritte zerlegen können.", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_677.wav", "doc_id": "oaOHnMCwad.seg_677", "src_text": "So let's start off by imagining that you're working for a newspaper and you're sifting through comments under your news article trying to remove toxic content.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Lassen Sie uns also davon ausgehen, dass Sie für eine Zeitung arbeiten und versuchen Sie, den Inhalt zu entfernen.", "score": 57.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_629.wav", "doc_id": "oeooqChmKK.seg_629", "src_text": "Still, even the best-performing models seem to have difficulties with reliably integrating backward knowledge presented only at inference time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Stille. Auch die besten Modelle scheinen Schwierigkeiten mit zuverlässig integriertem rückwärts gerichteten Wissen zu haben, das nur bei der Inferenzzeit vorgestellt wird.", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_424.wav", "doc_id": "WBLMIsdIrq.seg_424", "src_text": "Now, we use the MuDA benchmark to evaluate models and we find that context-aware models are significantly more accurate than models that do not use context for certain discourse phenomena such as formality and lexical cohesion.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "verwenden wir den Munda-Benchmark, um Modelle zu bewerten, und wir stellen fest, dass Kontextmodellen signifikant genau sind als Modelle, die Kontext nicht für", "score": 88.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_578.wav", "doc_id": "rISrKoXQCx.seg_578", "src_text": "So this has sound the alarm for us to acknowledge and tackle the fairness issues resulting by language model political leanings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "sich greifen könnte. Diese klingen also wie eine Warnung für Sie, um die Fairness-Probleme zu erkennen und anzugehen,", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_54.wav", "doc_id": "TVCREhgqUP.seg_54", "src_text": "And \"Mary knew that the girl slept.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "ein neues Trainingsprogramm für die Mädchen. Dies", "score": 2.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_351.wav", "doc_id": "gGbuDbHhyc.seg_351", "src_text": "Technically, this claim is not wrong, but there's a catch, which is that people do assume that there's an additional clean validation set available for model selection.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Tatsächlich ist diese Behauptung nicht falsch, aber es gibt einen Haken. Denn die Leute gehen davon aus, dass es für die Modellauswahl ein zusätzliches sauberes", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_30.wav", "doc_id": "aQpIWggfCo.seg_30", "src_text": "Since large language models are costly to deploy, it's essential to enable language planning ability of smaller and specialized models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Da große Sprachmodelle teuer zu implementieren sind, ist es unerlässlich, Sprachplanung für kleinere und spezialisierte Modelle zu ermöglichen.", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_220.wav", "doc_id": "oYCKgTzTDy.seg_220", "src_text": "Existing cross-lingual semantic parsing models are separately proposed and evaluated on data set of limited tasks and applications.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Bestehende Cross-Lingual-Semantic-Parsing-Modelle werden separat auf Datensätzen mit begrenzten Aufgaben und Anwendungen vorgeschlagen", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_254.wav", "doc_id": "oYCKgTzTDy.seg_254", "src_text": "We also find some other interesting findings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir werden auch einige andere interessante Erkenntnisse", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_677.wav", "doc_id": "oaOHnMCwad.seg_677", "src_text": "So let's start off by imagining that you're working for a newspaper and you're sifting through comments under your news article trying to remove toxic content.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "durchgeführt. Jetzt ist klar, dass Sie für eine Zeitung arbeiten und Kommentare und Artikel schreiben und versuchen, den Inhalt zu entfernen.", "score": 26.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_319.wav", "doc_id": "dJGfOSFgZO.seg_319", "src_text": "We call this approach annotating behaviors in chat or ABC-Eval in short.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir nennen diesen Ansatz „Annotieren von Verhaltensweisen in Chats“ oder „ABC-Eval in Kürze“.", "score": 68.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_101.wav", "doc_id": "uZBWfYjYnf.seg_101", "src_text": "First, to use already existing offline ST models without re-training or adopting specific architecture for SimulST.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Erstens, verwenden Sie bereits vorhandene Online-SD-Modelle ohne Wiedertrainieren oder die Anpassung spezifischer Architekturen für CivilSD; verwenden", "score": 58.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_416.wav", "doc_id": "WBLMIsdIrq.seg_416", "src_text": "And we called our tagger the Multilingual Discourse-Aware, or MuDA tagger.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "nennen unseren Tigrer den multilingualen Diskurs bewusst oder umuda Tigrer.", "score": 48.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_341.wav", "doc_id": "gGbuDbHhyc.seg_341", "src_text": "Hello, I am Dawei, a PhD student at Saarland University in Germany.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hallo, ich bin Dawid, ein Doktorand an der Universität Salent in Deutschland.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_241.wav", "doc_id": "oYCKgTzTDy.seg_241", "src_text": "And we also find many interesting results.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir stellen auch viele interessante Ergebnisse", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_29.wav", "doc_id": "aQpIWggfCo.seg_29", "src_text": "Our method greatly improves the planning ability both in semantic completeness and faithfulness to the constraint.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Unsere Methode verbessert die Anfälligkeit sowohl in semantischer Vollständigkeit als auch in Treue zu den Einschränkungen.", "score": 87.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_489.wav", "doc_id": "SUkmfOTvGi.seg_489", "src_text": "And this shows us that adaptive overfitting in this case is not observed.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und das zeigt uns, dass adaptive Überanpassung in diesem Fall nicht beobachtet wird.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_509.wav", "doc_id": "dvGkKzmIaN.seg_509", "src_text": "Therefore, it's necessary to protect the copyright of embedding as services.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Daher ist es notwendig, das Urheberrecht von Embedding- und Services zu schützen.", "score": 74.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_327.wav", "doc_id": "dJGfOSFgZO.seg_327", "src_text": "In addition, ABC-Eval labels are more predictive of the overall conversation quality compared to metrics produced by existing methods, as shown by this simple linear regression analysis.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "zu Metriken, die durch existierende Methoden erzeugt werden, wie durch die einfache Regressionsanalyse. Beispielsweise können Sie sehen, wie die Proportionen der Drehungen mit", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_783.wav", "doc_id": "WTTtiRKFZI.seg_783", "src_text": "So we get dependencies from the governor.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Gouverneurs hier die Abhängigkeiten von allen", "score": 81.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_373.wav", "doc_id": "gGbuDbHhyc.seg_373", "src_text": "Their performance gain and practicality are heavily overestimated.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wobei Leistungszuwächse und Praktikabilität stark überschätzt werden.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_467.wav", "doc_id": "SUkmfOTvGi.seg_467", "src_text": "We observe that models have been used in CoNLL-2003 to develop NER for almost 20 years and this naturally raises several problems.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir beobachten, dass Modelle seit 2003 Kornell verwendet haben, um NER zu entwickeln. Das wirft natürlich einige Probleme auf.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_712.wav", "doc_id": "oaOHnMCwad.seg_712", "src_text": "We also find most additional alignment with people who have a college education.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir finden auch die meisten zusätzlichen Zuordnungen zu Personen mit Hochschulbildung, daher finden", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_35.wav", "doc_id": "aQpIWggfCo.seg_35", "src_text": "In total, we generate 55,000 specific goals with scripts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Insgesamt generieren wir fünfzigtausend spezifische Ziele mit Skripten,", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_396.wav", "doc_id": "WBLMIsdIrq.seg_396", "src_text": "And second, how well do models handle these cases?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und zweitens, wie gut können die Modelle diese Fälle handhaben.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_420.wav", "doc_id": "WBLMIsdIrq.seg_420", "src_text": "First of all, when we use corpus-level metrics: so for BLEU, we find that context-agnostic models have the best performance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Erstens, wenn wir Korpus-Level-Metriken verwenden, sehen wir, dass die komplexen agnostischen Modelle die beste Leistung", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_372.wav", "doc_id": "gGbuDbHhyc.seg_372", "src_text": "To summarize, we showed that recent WSL approaches require clean, manually annotated samples for them to work properly.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Zusammenfassend lässt sich sagen, dass aktuelle WSL-Ansätze saubere, manuell annotierte Proben benötigen, um richtig zu funktionieren,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_358.wav", "doc_id": "gGbuDbHhyc.seg_358", "src_text": "We addressed these research questions in our work and our findings are as follows.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir gehen in unserer Arbeit auf diese Forschungsfragen ein, und unsere Ergebnisse sind wie folgt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_756.wav", "doc_id": "XejEJmgUmE.seg_756", "src_text": "And we saw here in the orange dotted line, the MPP judgments are relatively stable.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "wir sahen hier in der Orange-Dot-Zeile, dass die MP-P-Juristen relativ stabil sind.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_26.wav", "doc_id": "aQpIWggfCo.seg_26", "src_text": "In addition, we reward the script that contains the keywords of the target constraint.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Darüber hinaus vermeiden wir das Skript, das die Schlüsselwörter der Zielbeschränkung enthält.", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_657.wav", "doc_id": "FLkGnzVRew.seg_657", "src_text": "Thus, this is the model that we use to cold start the active learning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "mit dem wir beginnen, viel besser ist.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_438.wav", "doc_id": "hgIDlKNiFM.seg_438", "src_text": "Since its release in 2018, BERT has become one of the most effective approach to solve natural language processing tasks and offers huge performance gains compared to historical static and contextualized methods such as Word2vec, fastText, or more.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Seit seiner Veröffentlichung im Dezember ist BERT zu einem der effektivsten Ansätze für die Verarbeitung natürlicher Sprache geworden und bietet im Vergleich zu historischen statischen und kontextualisierten Methoden enorme Leistungssteigerungen.", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_582.wav", "doc_id": "rISrKoXQCx.seg_582", "src_text": "So if we do not sanitize political opinions in language model training data, the bias would propagate from pretraining data to language models to downstream tasks, ultimately creating fairness issues.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wenn wir politische Meinungen in Sprachmodellierungstrainingdaten nicht saniert haben, würde der Bias sich von vorbereitenden Daten zu Sprachmodellen und schließlich zu Downstream-Aufgaben ausbreiten, was letztendlich zu Fairnessproblemen führen würde.", "score": 81.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_55.wav", "doc_id": "TVCREhgqUP.seg_55", "src_text": "These utterances are paired with logical forms that represent core aspects of their meaning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "hatte. Diese Aussagen werden mit logischen Formen gepaart, die die Kernaspekte ihres Sinns darstellen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_612.wav", "doc_id": "oeooqChmKK.seg_612", "src_text": "First, we have the typical setting: \"Background-Pretrain\", where background knowledge is assumed to be available at pretrain time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Erstens den typischen Einstellung: Hintergrund vorbereiten, bei der man davon ausgeht, dass Hintergrundwissen zur Vorbereitung verfügbar ist.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_762.wav", "doc_id": "XejEJmgUmE.seg_762", "src_text": "So why does the match prefix affect the language model judgement so much?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "beeinflussen. Daher, warum beeinflusst der Match-Präfix die Sprachmodellbewertung so stark?", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_258.wav", "doc_id": "oYCKgTzTDy.seg_258", "src_text": "We conduct a comprehensive benchmark study on three representative types of multilingual language models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir führen eine umfassende Benchmark-Studie zu drei repräsentativen Typen von mehrsprachigen Sprachmodellen durch,", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_712.wav", "doc_id": "oaOHnMCwad.seg_712", "src_text": "We also find most additional alignment with people who have a college education.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir finden auch die meisten zusätzlichen Angaben zu Personen mit Hochschulbildung, so dass", "score": 84.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_120.wav", "doc_id": "uZBWfYjYnf.seg_120", "src_text": "And we also released open source the code and models and simultaneous output to facilitate the reproducibility of our work.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "und wir stellen auch Open-Source-Code und Modelle zur Verfügung, um die Reproduzierbarkeit unserer Arbeit zu erleichtern,", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_518.wav", "doc_id": "dvGkKzmIaN.seg_518", "src_text": "Therefore, in this paper we propose Embedding marker, which is a backdoor based watermark method applicable to embedding as services.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Daher schlagen wir in diesem Papier Embedder vor, eine Backdoor-basierte Methode zur Anwendung auf Embedding- und Dienstleistungen.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_581.wav", "doc_id": "rISrKoXQCx.seg_581", "src_text": "It's like between Scylla and Charybdis.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "werden, wie zwischen s und kris.", "score": 22.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_693.wav", "doc_id": "oaOHnMCwad.seg_693", "src_text": "Our framework works in two main steps.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Unser Rahmenwerk funktioniert in zwei Hauptschritten.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_812.wav", "doc_id": "WTTtiRKFZI.seg_812", "src_text": "So the proportion is bigger of the left short conjunct.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "ist die Proportion größer von dem linken kürzeren Konjunkt,", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_736.wav", "doc_id": "XejEJmgUmE.seg_736", "src_text": "And then the hope is that the model, basically, puts more probability to the acceptable sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "die Hoffnung, dass das Modell im Grunde mehr Wahrscheinlichkeit auf die akzeptable Situation legt.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_114.wav", "doc_id": "uZBWfYjYnf.seg_114", "src_text": "And we compare with popular strategies that are also applied to offline models that are the Wait-k strategy and the Local Agreement.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und wir vergleichen mit geeigneten Strategien, die auch auf Offline-Modelle angewendet werden können, nämlich die Whitkey-Strategie und die lokale Vereinbarung,", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_282.wav", "doc_id": "PIZEXUFLAR.seg_282", "src_text": "We use all the instances in the test split for each task.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Milleisenas auswählen. Wir verwenden alle Instanzen im Testset für", "score": 63.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_403.wav", "doc_id": "WBLMIsdIrq.seg_403", "src_text": "And we perform our analysis on transcripts of TED talks that have been translated from English to 14 different languages.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und wir führen unsere Analysen auf Transkripte von Ted Talks durch, die in vierzehn verschiedenen Sprachen übersetzt wurden.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_868.wav", "doc_id": "GvEBWkLmuI.seg_868", "src_text": "And finally, for black women, we see that some of the top words are things like \"strong\" and \"resilient\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "schließlich sehen wir bei den schwarzen Frauen, dass einige der Top-Wörter Dinge wie stark und widerstandsfähig sind.", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_505.wav", "doc_id": "dvGkKzmIaN.seg_505", "src_text": "Currently, large language models such as GPT, LLAMA, PALM are exceptional in natural language understanding and generation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "zunächst die Hintergründe der Einbettung von Diensten erläutern. Derzeit sind große Sprachmodelle wie TpT, Llama, Palm im Bereich des natürlichen Sprachverständnisses und der Sprachgenerierung", "score": 13.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_285.wav", "doc_id": "PIZEXUFLAR.seg_285", "src_text": "During training, we mix all the instances for all the tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "während des Trainings mischen wir alle Instanzen für alle Aufgaben,", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_401.wav", "doc_id": "WBLMIsdIrq.seg_401", "src_text": "We can think of words that have high P-CXMI as ones that require context for translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir können annehmen, dass Wörter mit hohem XMI eine Übersetzung erfordern.", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_621.wav", "doc_id": "oeooqChmKK.seg_621", "src_text": "We evaluate the data set both with human study participants, and established coreference resolution models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir bewerten das Datensatz sowohl mit menschlichen Studienteilnehmern als auch mit etablierten Frage-Antwort-Modellen.", "score": 59.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_363.wav", "doc_id": "gGbuDbHhyc.seg_363", "src_text": "Our second finding is that increasing the number of clean validation samples will help WSL approaches to achieve better performance, as shown in the figure on the left.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Unsere zweite Erkenntnis ist, dass die Erhöhung der Anzahl der Reinheitsvalidierungsproben den WSS-Ansätzen helfen wird, eine bessere Leistung zu erzielen, wie in der Abbildung auf der linken Seite dargestellt.", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_285.wav", "doc_id": "PIZEXUFLAR.seg_285", "src_text": "During training, we mix all the instances for all the tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "während des Trainings vermischen wir alle Instanzen für alle Aufgaben;", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_36.wav", "doc_id": "aQpIWggfCo.seg_36", "src_text": "To ensure the quality of the validation and test set, we ask crowd-sourced workers to find and revise the incorrect samples.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "um die Qualität der Validierung und Teststätten zu gewährleisten, und bitten Crowdsource-Worker, die unkorrekten Muster zu finden und zu korrigieren.", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_757.wav", "doc_id": "XejEJmgUmE.seg_757", "src_text": "Now, what happens when we choose sentences from the same data set?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Was passiert nun, wenn wir Sätze aus demselben Datensatz auswählen?", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_295.wav", "doc_id": "PIZEXUFLAR.seg_295", "src_text": "Also, transfer learning from natural instruction dataset can benefit instruction tuning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "natürlichen Sprachdatensätzen kann beim Anpassen von Anweisungen helfen.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_530.wav", "doc_id": "dvGkKzmIaN.seg_530", "src_text": "Copyright verification is to detect whether a model behind another service contains the word mark.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die Urheberrechtsprüfung soll feststellen, ob ein Modell hinter einem anderen Dienst die Wasserzeichen enthält.", "score": 62.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_567.wav", "doc_id": "rISrKoXQCx.seg_567", "src_text": "We separately pretrain language models on the two different temporal corpora.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und trennen Sprachmodelle auf zwei verschiedene zeitliche Korpora.", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_740.wav", "doc_id": "XejEJmgUmE.seg_740", "src_text": "We're trying to revisit the MPP pipeline by asking the model to evaluate acceptability on longer and longer sequences.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "versuchen, indem wir bitten, das Modell zu überprüfen, um die Akzeptanz auf längere und längere Sequenzen zu bewerten.", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_174.wav", "doc_id": "SLpqvupgvW.seg_174", "src_text": "Our data set covers three different domains: music, books, and recipes.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "unser Datensatz umfasst drei verschiedene Bereiche: Musik, Bücher und Rezepte.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_384.wav", "doc_id": "WBLMIsdIrq.seg_384", "src_text": "A Data-driven, Multilingual Exploration\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "vorstellen.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_817.wav", "doc_id": "WTTtiRKFZI.seg_817", "src_text": "Here we have coordination of two verbs and there's no outsides, external governor.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Außenstelle des Gouverneurs nicht zusteht, die beiden Länder gegeneinander auszuspielen.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_91.wav", "doc_id": "TVCREhgqUP.seg_91", "src_text": "If you want to learn more about our experiments and how we address these challenges, please have a look at our paper or come to our poster.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wenn Sie mehr über unsere Experimente und wie wir diese Herausforderungen angehen möchten, bitten wir Sie, unsere Publikation oder unsere Poster zu besuchen.", "score": 83.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_28.wav", "doc_id": "aQpIWggfCo.seg_28", "src_text": "With our method, InstructGPT can generate scripts of higher quality.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Mit unserer Methode kann Insensitivität zu höherer Qualität führen.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_684.wav", "doc_id": "oaOHnMCwad.seg_684", "src_text": "Positionality is simply the perspectives that people hold as a result of their demographics, identity, and life experiences.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Positionierung ist einfach die Perspektiven, die Menschen aufgrund ihrer Demografie, Identität und Lebenserfahrungen haben.", "score": 82.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_458.wav", "doc_id": "hgIDlKNiFM.seg_458", "src_text": "Which is not the case for the model based on CamemBERT weights and tokenizer, which suffer from stability issues.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dies ist nicht der Fall für das Modell, das auf Camberweights und Token basiert, die aus Stabilitätsgründen stammen.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_368.wav", "doc_id": "gGbuDbHhyc.seg_368", "src_text": "Finally, the performance improvement claimed in previous WSL approaches can be easily achieved by allowing to continue fine-tuning on the clean validation samples.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Schließlich kann die Leistung, die in früheren WS-L-Annäherungen behauptet wurde, leicht erreicht werden, indem man weiterhin feine Anpassungen auf sauberen Validierungssampeln vornimmt.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_646.wav", "doc_id": "FLkGnzVRew.seg_646", "src_text": "We used dissonance-first approach, as seen in the flow chart here.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir verwenden den Distanz-First-Ansatz, wie in der Flowchart hier zu sehen ist. „Tweets“ werden mit einem", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_237.wav", "doc_id": "oYCKgTzTDy.seg_237", "src_text": "And during inference we can use this model to translate German queries or Chinese queries, et cetera.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "und können dieses Modell während des Lernens verwenden. Um deutsche oder chinesische Anfragen zu übersetzen. Und wir berücksichtigen", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_709.wav", "doc_id": "oaOHnMCwad.seg_709", "src_text": "For example, we find that data sets and models are most aligned to English speaking countries.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Beispiel finden wir heraus, dass die Datenmodelle für die meisten englischsprachigen Länder am besten geeignet sind,", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_647.wav", "doc_id": "FLkGnzVRew.seg_647", "src_text": "Tweets were passed using the PDTB parser, and pairs of discourse units were annotated according to the guidelines that are described in our paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "„PDT-Parser“ verarbeitet und „Paare von Diskussionsgruppen“ werden entsprechend der in der Leitlinie beschriebenen Anweisungen annotiert.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_848.wav", "doc_id": "GvEBWkLmuI.seg_848", "src_text": "So the Marked Words method draws upon the sociolinguistic concept of \"markedness\", which states that there is an unmarked default, and any group that differs from that default is linguistically marked.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die Markierungsmethode bezieht sich auf den soziolinguistischen Begriff der Markiertheit, der besagt, dass es sich um eine unmarkierte Gruppe handelt. Also zum Beispiel das Wort Mann oder Krieger ist normalerweise mit dem", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_439.wav", "doc_id": "hgIDlKNiFM.seg_439", "src_text": "Since then, this model has been adapted to many other languages, like in French with CamemBERT, and also in domains like biomedical with PubMedBERT and BioBERT and on clinical with ClinicalBERT, but mostly in English.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Seitdem wurde dieses Modell auf viele andere Sprachen wie Französisch mit CamemBERT, andere Domänen wie Biomedizin mit PubMedBERT und BioBERT, und klinische mit ClinicalBERT, aber hauptsächlich auf Englisch adaptiert.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_540.wav", "doc_id": "dvGkKzmIaN.seg_540", "src_text": "We also validate the covertness of the provided embedding by visualising the embedding of sentences on four dataset [INAUDIBLE 4:39] PCA.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir haben auch die Geheimniskrämerei der vorgestellten Einbettung durch die Visualisierung der Einbettung von Sätzen auf 40. z. v. p. c. A. bestätigt,", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_740.wav", "doc_id": "XejEJmgUmE.seg_740", "src_text": "We're trying to revisit the MPP pipeline by asking the model to evaluate acceptability on longer and longer sequences.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir versuchen, die MPB-Pipeline zu überprüfen, indem wir die Modelle bitten, die Akzeptanz auf längere und längere Sequenzen zu bewerten.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_419.wav", "doc_id": "WBLMIsdIrq.seg_419", "src_text": "And finally, we use our benchmark as well as other metrics to evaluate different models on the document-level machine translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Und schließlich verwenden wir unseren Benchmark wie andere Matrizen, um verschiedene Modelle auf der Dokumenten-Ebene der Maschinentranslation zu bewerten.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_852.wav", "doc_id": "GvEBWkLmuI.seg_852", "src_text": "So in our method, we first designate what the unmarked and marked groups are, and then we compare the personas using the Fightin’ Words method, which is basically using weighted log-odds ratios to distinguish the top words for each marked group.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In unserer Methode werden zuerst die unmarkierten und markierten Gruppen ermittelt. Und dann vergleichen wir die Personen, die die Fighting Words-Methode verwenden, die im Grunde die Weighted Logit-Methode verwendet, um die Top-Wörter jeder Gruppe zu unterscheiden.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_673.wav", "doc_id": "FLkGnzVRew.seg_673", "src_text": "Thank you.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Danke.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_462.wav", "doc_id": "hgIDlKNiFM.seg_462", "src_text": "So thank you for this presentation, and we are looking forward to exchange at the poster session in Toronto.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "danken wir Ihnen für diese Präsentation und wir freuen uns darauf, in Toronto bei der Postzustellung Aktionen zu unternehmen.", "score": 58.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_717.wav", "doc_id": "oaOHnMCwad.seg_717", "src_text": "So, given that there is positionality in NLP, what can we do about it?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Angesichts der Tatsache, dass es sich um eine Position in einer LED und LP handelt, was können wir tun?", "score": 59.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_395.wav", "doc_id": "WBLMIsdIrq.seg_395", "src_text": "First, when does translation require context?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Erstens: Wann benötigt eine Übersetzung einen", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_633.wav", "doc_id": "FLkGnzVRew.seg_633", "src_text": "I would like to present our work accepted into ACL 2023 as a long paper, \"Transfer Learning for Dissonance Detection: Addressing the Rare-Class Challenge.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Ich würde gerne unsere Arbeit, die in den ACL 23 als langen Papier zur Transfer-Learning für die Erkennung von Dissonanzdetektionen, die sich der seltenen Klasse widmet, angenommen wurde, präsentieren.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_593.wav", "doc_id": "oeooqChmKK.seg_593", "src_text": "But natural language understanding often requires knowledge that is also supplied at inference time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Aber das Verständnis der natürlichen Sprache erfordert oft Wissen, das auch im Zeitraum der Nachsorge bereitgestellt wird.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_29.wav", "doc_id": "aQpIWggfCo.seg_29", "src_text": "Our method greatly improves the planning ability both in semantic completeness and faithfulness to the constraint.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Unser Verfahren verbessert die Schmerztoleranz sowohl in der semantischen Vollständigkeit als auch in der Treue gegenüber den Einschränkungen.", "score": 88.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_386.wav", "doc_id": "WBLMIsdIrq.seg_386", "src_text": "So a lot of translations depend on context.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Viele Übersetzungen hängen vom Kontext ab, zum Beispiel:", "score": 79.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_163.wav", "doc_id": "SLpqvupgvW.seg_163", "src_text": "Consider this alternative question.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "möchten. Überlegen Sie sich diese alternative", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_737.wav", "doc_id": "XejEJmgUmE.seg_737", "src_text": "The current MPP pipeline basically doesn't allow us to evaluate a model's acceptance towards longer sentences.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die aktuelle MPP-Pipeline ermöglicht es uns im Grunde nicht, die Akzeptanz eines Modells für längere Sätze zu bewerten.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_73.wav", "doc_id": "TVCREhgqUP.seg_73", "src_text": "This makes our approach quite flexible and expressive.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "hat. Dies macht unseren Ansatz sehr flexibel und ausdrucksstark.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_454.wav", "doc_id": "hgIDlKNiFM.seg_454", "src_text": "However, we can observe that data from heterogeneous sources appear to be more versatile.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir können jedoch feststellen, dass Daten aus heterogenen Quellen vielseitiger zu sein scheinen,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_245.wav", "doc_id": "oYCKgTzTDy.seg_245", "src_text": "And we evaluate on mT5 and XLM-R + PTR on multilingual setting.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und wir evaluieren das auf Mf fünf und das Beispiel Xlm plus Pd auf mehrsprachige Einstellungen.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_202.wav", "doc_id": "SLpqvupgvW.seg_202", "src_text": "For example, the one with the piano music.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Zum Beispiel der mit der Klaviermusik.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_668.wav", "doc_id": "FLkGnzVRew.seg_668", "src_text": "However, the annotators also find the examples difficult.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Annotatoren stellen auch heraus, dass die Beispiele schwierig sind.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_105.wav", "doc_id": "uZBWfYjYnf.seg_105", "src_text": "Our solution is to propose EDAtt, or Encoder-Decoder Attention, and it is a strategy for which we decide whether to emit or not a partial translation, based on where attention points to.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Unsere Lösung besteht darin, die Aufmerksamkeit zu fokussieren oder zu kodieren, und es ist eine Strategie, bei der wir entscheiden, ob wir eine partielle Übersetzung vornehmen oder nicht, basierend auf den Punkten der Aufmerksamkeit.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_234.wav", "doc_id": "oYCKgTzTDy.seg_234", "src_text": "We also test Monolingual Few-shot setting by training monolingual models with only 10% of training data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir testen auch die monolingualen Einstellungen, indem wir mit nur zwölf Prozent der Trainingsdaten monolinguale Modelle trainieren.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_523.wav", "doc_id": "dvGkKzmIaN.seg_523", "src_text": "The trigger set is a group of words in a moderate frequency interval.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "die Antriebsmenge ist eine Gruppe von Wörtern in einem moderaten Frequenzintervall.", "score": 72.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_359.wav", "doc_id": "gGbuDbHhyc.seg_359", "src_text": "First, we find that, interestingly, recent WSL methods indeed require clean validation samples to work properly.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wir fest, dass interessante neuere WSL-Methoden tatsächlich saubere Validierungsmuster erfordern, um richtig zu funktionieren.", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_510.wav", "doc_id": "dvGkKzmIaN.seg_510", "src_text": "To protect the copyright of embedding as services, one of the solutions is to embed a watermark in the provider service and detect whether another service contain the watermark.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Um das Urheberrecht von eingebetteten Diensten zu schützen, wird eine der Lösungen ein Wasserzeichen in den Dienst des Anbieters eingebettet und festgestellt, ob ein anderes Dienst das Wasserzeichen enthält.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_248.wav", "doc_id": "oYCKgTzTDy.seg_248", "src_text": "I think this is known as the \"Curse of Multilinguality\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "erzielt. Ich glaube, das ist ein Fluch der Vielsprachigkeit.", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_772.wav", "doc_id": "WTTtiRKFZI.seg_772", "src_text": "As you may know, there are different dependency structures assumed by different theories and corpus approaches.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wie Sie vielleicht wissen, werden verschiedene Abhängigkeitsstrukturen von verschiedenen Theorien und Korpusansätzen angenommen, also", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_235.wav", "doc_id": "oYCKgTzTDy.seg_235", "src_text": "And we test Multilingual Model which we train one multilingual model for all languages.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und es hat ein monolinguales Modell, das wir für alle Sprachen trainieren.", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_43.wav", "doc_id": "aQpIWggfCo.seg_43", "src_text": "We use large language models to generate a high-quality script dataset, CoScript, for constrained language planning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wen wenn nicht?", "score": 41.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_95.wav", "doc_id": "uZBWfYjYnf.seg_95", "src_text": "And what are the problems of the current SimulST models?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Und was sind die Probleme der aktuellen SimulST-Modelle?", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_189.wav", "doc_id": "SLpqvupgvW.seg_189", "src_text": "When we move higher in the list, the entities become more similar to each other and it's usually harder to make the disambiguation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wenn wir uns weiter oben auf der Liste bewegen, werden die Entitäten sich gegenseitig ähnlicher und es ist in der Regel schwieriger, die Abweichung zu erkennen.", "score": 88.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_751.wav", "doc_id": "XejEJmgUmE.seg_751", "src_text": "Finally, we can choose sentences from a completely unrelated domain such as Wikipedia.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Schließlich können wir Sätze aus einem völlig unabhängigen Domäne, wie z.B. einer", "score": 69.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_709.wav", "doc_id": "oaOHnMCwad.seg_709", "src_text": "For example, we find that data sets and models are most aligned to English speaking countries.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zum Beispiel finden wir heraus, dass die Datensätze und Modelle am meisten mit englischsprachigen Ländern ausgerichtet sind.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_155.wav", "doc_id": "wLqFAuDnKa.seg_155", "src_text": "However, the \"Style/Awkward\" category for PaLM is lower than for the state-of-the-art systems, which is an additional signal that PaLM provides really fluent output, but still with some problems of accuracy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Allerdings ist die Kategorie „Stil“ für Panm niedriger als für die neuesten Systeme, was ein zusätzlicher Hinweis ist. Doch die Ausgabe ist wirklich fließend, aber mit einigen Problemen der Genauigkeit. Das", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_689.wav", "doc_id": "oaOHnMCwad.seg_689", "src_text": "So prior work has suggested some anecdotal evidence of having positionality, such as cultural gaps and models and data sets, as well as theoretical definitions of model positionality.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "So legen Prinziparbeiter einige anekdotische Beweise für die Positionierung voraus, wie kulturelle Lücken und Modelle und Datensätze, sowie die tatsächliche Definition der Modellposition.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_262.wav", "doc_id": "oYCKgTzTDy.seg_262", "src_text": "Thanks for listening.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dank fürs Zuhören.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_645.wav", "doc_id": "FLkGnzVRew.seg_645", "src_text": "To the goal of creating a cognitive dissonance resource, we conducted a large scale annotation of dissonance relations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Zum Zweck der Schaffung einer kognitiven Distanzressource haben wir eine große Anzahl von Distanzbeziehungen hergestellt.", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_113.wav", "doc_id": "uZBWfYjYnf.seg_113", "src_text": "But also we want that they are shifted on the left.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "auf diesem Plot. Aber auch wir wollen, dass sie auf der linken Seite verschoben werden.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_813.wav", "doc_id": "WTTtiRKFZI.seg_813", "src_text": "But what's novel in this paper is that we observed that this tendency only occurs when the governor is on the left or absent.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Was jedoch neu ist, ist, dass wir festgestellt haben, dass diese Tendenz nur auftritt, wenn der Gouverneur links ist oder abwesend ist. Richtig, also", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_467.wav", "doc_id": "SUkmfOTvGi.seg_467", "src_text": "We observe that models have been used in CoNLL-2003 to develop NER for almost 20 years and this naturally raises several problems.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir beobachten, dass Modelle fast 20 Jahre lang Kondor verwenden, um NER zu entwickeln, und das bringt natürlich mehrere Probleme mit sich, zum Beispiel:", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_626.wav", "doc_id": "oeooqChmKK.seg_626", "src_text": "Additional experiments with fictional knowledge indicated even the best performing models, cannot reliably integrate backward knowledge provided only at inference time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zusätzliche Experimente mit fiktivem Wissen zeigen, dass selbst die besten Modelle das Hintergrundwissen nicht zuverlässig integrieren können.", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_847.wav", "doc_id": "GvEBWkLmuI.seg_847", "src_text": "The benefit of this is that we get really specific stereotypes and patterns, without having to rely on any specific lexicon.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Der Vorteil davon ist, dass wir wirklich spezifische Stereotypen und Muster erhalten, ohne uns auf einen spezifischen Lexikon verlassen zu müssen.", "score": 58.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_797.wav", "doc_id": "WTTtiRKFZI.seg_797", "src_text": "It's okay the way instead of \"it\", we have this long NP.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "ist absolut faszinierend, ich bin okay, anstatt davon, dass wir die lange und lange Pinguine haben.", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_468.wav", "doc_id": "SUkmfOTvGi.seg_468", "src_text": "Firstly, can these models generalise to modern data?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Erstens, können diese Modelle auf moderne Daten generalisiert werden?", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_665.wav", "doc_id": "FLkGnzVRew.seg_665", "src_text": "On further rounds of AL with two best strategies, we improve dissonance classification AUC to 0.75, which is the best performance that we have on the task so far.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In weiteren Runden von AL mit zwei der besten Strategien verbesserten wir die Distanzklasse AUC auf 0,75, was bis jetzt die beste Leistung ist, die wir auf der Aufgabe erreicht haben.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_264.wav", "doc_id": "PIZEXUFLAR.seg_264", "src_text": "So with the advances in large language models, many works started to explore new learning paradigms of reusing pre-trained language models for different downstream tasks in a parameter and data-efficient way.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Mit den Fortschritten bei großen Sprachmodellen begannen viele Arbeiten, neue Lernparadigmen für die Wiederverwendung von vortrainierten Sprachmodellen für verschiedene Downstream-Aufgaben zu erforschen. Viele Studien", "score": 57.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_443.wav", "doc_id": "hgIDlKNiFM.seg_443", "src_text": "To answer this question, we compare DrBERT with our ChuBERT model, which is based on anonymized data obtained from the Nantes University Hospital data warehouse.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Um diese Frage zu beantworten, vergleichen wir Dr. Bert mit unserem Schulbert-Modell, das auf anonymisierten Daten basiert, die wir aus dem Non University Hospital Data House erhalten.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_59.wav", "doc_id": "TVCREhgqUP.seg_59", "src_text": "In particular, they often fail to reproduce the systematic correspondences between input and output, such as those that are color-coded in the example.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Insbesondere scheiterten sie oft daran, die systematischen Korrespondenzen zwischen Input und Output zu reproduzieren, wie die, die im Beispiel farbcodiert sind.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_268.wav", "doc_id": "PIZEXUFLAR.seg_268", "src_text": "Additionally, at the time of our research, we discovered a considerable discrepancy in the availability of instructional datasets between NLP and multi-modal.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Darüber hinaus entdeckten wir in der Zeit unserer Forschung eine beträchtliche Diskrepanz in der Verfügbarkeit von Trainingsdatensätzen zwischen LBP und Multimodal.", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_860.wav", "doc_id": "GvEBWkLmuI.seg_860", "src_text": "So instead to do that, we'll turn to the results from our Marked Words method to show how these positive-seeming words facilitate stereotypes and essentializing narratives.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Stattdessen werden wir uns die Ergebnisse aus der Marktwahl ansehen, um zu zeigen, wie diese positiv erscheinenden Wörter Stereotypen und Stereotypisierungen aufweisen.", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_81.wav", "doc_id": "TVCREhgqUP.seg_81", "src_text": "Our model outperforms the others by a large margin on generalization to deeper recursion.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Unser Modell übertrifft die anderen bei der Generalisierung zu tiefer Rekursion deutlich. Andere", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_696.wav", "doc_id": "oaOHnMCwad.seg_696", "src_text": "And so we opt to re annotate data to get many annotates for instance and to get a rich set of demographic data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "werden, und so versuchen wir, Daten zu wiederauswerten, um viele Annotatoren für jede Instanz zu erhalten und zu teilen. -Set anhand der demografischen Daten.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_277.wav", "doc_id": "PIZEXUFLAR.seg_277", "src_text": "We follow the method from OFA and formulate all the tasks in a unified sequence-to-sequence format.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir folgen dem von OFA gegebenen Leitfaden und formulieren alle Aufgaben in einer sequenz-zu-sequenz-Formatierung,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_642.wav", "doc_id": "FLkGnzVRew.seg_642", "src_text": "High cognitive dissonance is also related to anxiety disorders and can help understand people's mental health better.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Eine hohe kognitive Distanz ist auch mit Angststörungen verbunden und kann helfen, Menschen ihre geistige Gesundheit besser zu verstehen.", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_143.wav", "doc_id": "wLqFAuDnKa.seg_143", "src_text": "It's the examples that carry most of the weight.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "größten Teil des Gewichts haben. Die", "score": 9.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_776.wav", "doc_id": "WTTtiRKFZI.seg_776", "src_text": "So these two approaches are asymmetric.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "also diese", "score": 3.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_786.wav", "doc_id": "WTTtiRKFZI.seg_786", "src_text": "OK.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "das", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_311.wav", "doc_id": "dJGfOSFgZO.seg_311", "src_text": "This work was done by the Emory NLP Lab led by Professor Jinho Choi at Emory University and in collaboration with Amazon Alexa AI.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Diese Arbeit wurde vom Emery Lab der Universität von Emery geleitet, unter der Leitung von Professor Gino Ochoa und in Zusammenarbeit mit Amazon Alexa AI. Lassen", "score": 36.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_394.wav", "doc_id": "WBLMIsdIrq.seg_394", "src_text": "In this work, we try to answer these two questions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In dieser Arbeit versuchen wir, diese beiden Fragen zu beantworten:", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_471.wav", "doc_id": "SUkmfOTvGi.seg_471", "src_text": "To investigate these problems, we developed the CoNLL++ Dataset.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Um diese Probleme zu untersuchen, entwickeln wir das Daten-Satz Carneal", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_16.wav", "doc_id": "aQpIWggfCo.seg_16", "src_text": "Then we conduct detailed analysis to investigate why learning models fail.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Dann führen wir detaillierte Analysen durch, um zu ergründen, was die Landmodellfunktionen sind. Die Ergebnisse", "score": 68.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_27.wav", "doc_id": "aQpIWggfCo.seg_27", "src_text": "We only keep the script if the target goal scores the highest in the goal set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wenn das Skript die höchste Punktzahl im Zielbereich hat.", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_513.wav", "doc_id": "dvGkKzmIaN.seg_513", "src_text": "Second, the watermark should not degrade the utility of the provided embeddings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zweitens sollte die Wasserzeichenmethode die Nützlichkeit der vorgesehenen Einbauten nicht verringern.", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_123.wav", "doc_id": "wLqFAuDnKa.seg_123", "src_text": "This is joint work with my colleagues from Google Translate.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dies ist eine gemeinsame Arbeit mit meinen Kollegen von Google Translate.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_594.wav", "doc_id": "oeooqChmKK.seg_594", "src_text": "For example, in the sentence, \"John saw the newly elected president on TV.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zum Beispiel sah John in der Sätze den neu gewählten Präsidenten im Fernsehen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_824.wav", "doc_id": "WTTtiRKFZI.seg_824", "src_text": "And we show in the paper how this provides an argument against asymmetric structures of coordination, as these two, and for the symmetric structures, as these two.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und wir zeigen im Papier, wie dies geschieht. 1 bietet ein Argument gegen asymmetrische Koordinierungsstrukturen wie diese beiden und fördert asymmetrische Strukturen wie diese beiden. Sehen", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_107.wav", "doc_id": "uZBWfYjYnf.seg_107", "src_text": "For example, if we receive a speech chunk containing \"I'm going to talk about...\" and our model predicts the translation in German, and we will look at the cross-attention weights, we'll see that the first two words points to the earliest received speech frames, while the last word points to the last received speech frames, as lambda speech frames.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wenn wir beispielsweise einen Spruchabschnitt erhalten, der „Ich werde darüber sprechen“ enthält, und unser Modell eine Übersetzung ins Deutsche vorhersagt, wird die Übersetzung in der Regel verwendet. Und wir werden uns die Gewichte ansehen. Wir können sehen, dass die ersten beiden Wörter auf die frühesten erhaltenen Sprachrahmen hinweisen, während das letzte Wort auf die letzten erhaltenen Sprachrahmen hinweist („Lambda-Sprachrahmen“).", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_3.wav", "doc_id": "aQpIWggfCo.seg_3", "src_text": "Previous work has exploited language models to plan for abstract goals of stereotypical activities such as \"make a cake\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "vorherigen Arbeit wurden Sprachmodelle genutzt, um für abstrakte Ziele von stereotypischen Aktivitäten wie Make-a-Kick zu planen und zu", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_431.wav", "doc_id": "hgIDlKNiFM.seg_431", "src_text": "Hi, I am Yanis Labrak and I will present you our works on \"DrBERT: A Robust Pre-trained Model in French for Biomedical and Clinical Domains.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hallo, ich bin Yanislav und werde Ihnen unsere Arbeiten in Französisch für biomedizinische und klinische Bereiche vorstellen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_753.wav", "doc_id": "XejEJmgUmE.seg_753", "src_text": "So how does the model do?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "uns ansehen. Also, wie funktioniert das Modell?", "score": 58.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_566.wav", "doc_id": "rISrKoXQCx.seg_566", "src_text": "So we divide pretraining corpora, into pre 45th president of the United States and after 45th president of the United States.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir unterteilen also das Präsidium der Vereinigten Staaten in zwei verschiedene temporale Korpora und das Präsidium der Vereinigten Staaten", "score": 52.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_101.wav", "doc_id": "uZBWfYjYnf.seg_101", "src_text": "First, to use already existing offline ST models without re-training or adopting specific architecture for SimulST.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zunächst verwenden wir bereits bestehende Offline-CT-Modelle ohne Wiedertrainieren oder spezifische Architekturen für CT-CT anpassen:", "score": 56.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_274.wav", "doc_id": "PIZEXUFLAR.seg_274", "src_text": "For investigating multi-modal instruction tuning on our proposed dataset, we take OFA, a unified multi-modal pre-trained model, as our base model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Für die Untersuchung der mehrmodalen Anweisung auf unserem vorgeschlagenen Datensatz nehmen wir Ofa als unser Basismodell, Ofa", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_495.wav", "doc_id": "SUkmfOTvGi.seg_495", "src_text": "So going back to the question that we raised in the title of our paper Do CoNLL-2003 taggers still work in 2023?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Zurück zu der Frage, die wir in unserem Bericht aufgeworfen haben: Funktionieren die Cornel-Tagger noch im Jahr 2003? Und", "score": 36.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_301.wav", "doc_id": "PIZEXUFLAR.seg_301", "src_text": "As we can see by transfer learning from natural instruction datasets, the model can achieve much better sensitivity compared to the original OFA model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Transferlernen von den Datensätzen der natürlichen Anweisung kann das Modell im Vergleich zum ursprünglichen OA-Modell eine viel höhere Empfindlichkeit erreichen.", "score": 51.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_68.wav", "doc_id": "TVCREhgqUP.seg_68", "src_text": "Our approach predicts the output from the input in two steps.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Unser Ansatz prognostiziert den Output aus dem Input in zwei Schritten. Zuerst", "score": 84.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_723.wav", "doc_id": "oaOHnMCwad.seg_723", "src_text": "I mean, we want to emphasise that inclusive NLP isn't just making.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "ist die Masakani-Initiative. Ich möchte betonen, dass die inklusive NLP nicht nur alle", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_258.wav", "doc_id": "oYCKgTzTDy.seg_258", "src_text": "We conduct a comprehensive benchmark study on three representative types of multilingual language models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir führen eine umfassende Benchmark-Studie zu drei Vertretern von Typen von Mehrsprachigen Modellen durch,", "score": 83.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_288.wav", "doc_id": "PIZEXUFLAR.seg_288", "src_text": "In each experiment, we report the min and max performance and the standard deviation of the performance across all 5 experiments.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "in jedem Experiment bewerten. Wir berichten über die mittlere und maximale Leistung und die Standardabweichung der Leistung in allen fünf Experimenten.", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_583.wav", "doc_id": "rISrKoXQCx.seg_583", "src_text": "If we do try to sanitaze somehow, we would also risk censorship, or exclusion.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wenn wir versuchen würden, in irgendeiner Weise zu sanitieren, würden wir auch Zensur oder Auslassungen riskieren, und", "score": 81.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_179.wav", "doc_id": "SLpqvupgvW.seg_179", "src_text": "In the second speech bubble, Alice says, \"Do you mean 'Easy on Me' or 'I Gotta Feeling'?\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "In der zweiten Sprechblase sagt Alice: „Meinst du leicht von mir oder habe ich ein Gefühl?“", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_829.wav", "doc_id": "GvEBWkLmuI.seg_829", "src_text": "This work is done in collaboration with Esin Durmus and Dan Jurafsky.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "in Zusammenarbeit mit Esender und Danroski durchgeführt.", "score": 63.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_564.wav", "doc_id": "rISrKoXQCx.seg_564", "src_text": "For example, for RoBERTa further trained on the left-leaning Reddit corpus we can see a substantial liberal shift in terms of its political biases.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zum Beispiel, für Roberta, weiter trainiert auf dem linken linkierten Korpus, können wir einen substantiellen liberalen Verschiebung in In Bezug auf die politischen Vorurteile", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_198.wav", "doc_id": "SLpqvupgvW.seg_198", "src_text": "Here's for example, the Google search result for the song \"Easy on Me.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hier ist zum Beispiel das Google-Suchergebnis für das Lied „Easy Annie“. Für", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_778.wav", "doc_id": "WTTtiRKFZI.seg_778", "src_text": "They single out one of the conjuncts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Ansätze sind symmetrisch. Jetzt sind auch die", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_498.wav", "doc_id": "SUkmfOTvGi.seg_498", "src_text": "And lastly, please make sure to check out our paper, our data set and if you have any questions, feel free to contact me.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "bitte, überprüfen Sie unser Papier, unsere Datenbank, und wenn Sie irgendwelche Fragen haben, können Sie sich frei mit mir in Verbindung setzen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_260.wav", "doc_id": "oYCKgTzTDy.seg_260", "src_text": "And et cetera.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "usw. und wir", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_26.wav", "doc_id": "aQpIWggfCo.seg_26", "src_text": "In addition, we reward the script that contains the keywords of the target constraint.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Darüber hinaus achten wir auf das Skript, das die Schlüsselwörter des Zielkontrahnts enthält.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_821.wav", "doc_id": "WTTtiRKFZI.seg_821", "src_text": "So I'll concentrate on the right one.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "ermittelt werden kann. Was wir sagen, ist, dass die Regierung auf der", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_469.wav", "doc_id": "SUkmfOTvGi.seg_469", "src_text": "And when we develop new taggers, what is needed for good generalization?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und wenn wir neue Tags entwickeln, was ist für eine gute Generalisierung erforderlich?", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_522.wav", "doc_id": "dvGkKzmIaN.seg_522", "src_text": "Before these main steps, we first select a trigger set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hauptschritte durchgehen, wählen wir zunächst ein Triggerset.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_674.wav", "doc_id": "oaOHnMCwad.seg_674", "src_text": "Hi everyone.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hallo,", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_752.wav", "doc_id": "XejEJmgUmE.seg_752", "src_text": "So this will tell us like whether the models acceptability judgments are actually impacted by any context, like, whether the context is coming from a different subset of the data set, or whether it's like completely irrelevant, to the current like to the sentence that we are looking at.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "anderen Sprache oder einer anderen Domäne, auswählen, um die Akzeptanzfähigkeit der Modelle zu testen. Wikipedia, auswählen. Erzählen Sie uns, ob die Akzeptabilitätsurteile der Modelle tatsächlich von irgendeinem Kontext beeinflusst werden, wie zum Beispiel, ob der Kontext aus einem anderen Teilmenge des Datensatzes kommt oder ob er völlig irrelevant zum aktuellen Satz ist, den wir", "score": 61.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_109.wav", "doc_id": "uZBWfYjYnf.seg_109", "src_text": "If we go on and we receive another speech chunk, and our model predicts other three words and we will look at those cross-attention weights, we will see that no word points to the last lambda speech frames.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wenn wir weitermachen und wir erhalten einen anderen Speech-Tag und unser Modell prädiziert. Wir werden die drei Wörter in der Reihenfolge, in der sie geordnet sind, und wir werden die Kreuz-Atten-Wege darauf untersuchen. Wir werden sehen, dass kein Wort auf die letzten Lambdas des Speech-Frames zeigt.", "score": 63.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_721.wav", "doc_id": "oaOHnMCwad.seg_721", "src_text": "Our third recommendation is to build specialised datasets and models within 4 specific communities.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "dritte Empfehlung ist es, spezielle Datensätze und Modelle in vier spezifischen Gemeinschaften zu erstellen,", "score": 82.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_332.wav", "doc_id": "dJGfOSFgZO.seg_332", "src_text": "These reliable, informative, and distinct ABC-Eval metrics enable us to evaluate conversational AI with a higher resolution than previous methods are able to achieve.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Diese zuverlässigen, informierenden und einzigartigen ABC-EVL-Metriken ermöglichen es uns, die konversationelle AI mit einer höheren Auflösung zu bewerten als es die vorherigen Methoden erreichen können.", "score": 88.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_238.wav", "doc_id": "oYCKgTzTDy.seg_238", "src_text": "And we also consider Cross-lingual Zero-shot and Few-shot transfer.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "wir erwägen auch die Transferierung von Zero-Shot- und Feature-Transfer", "score": 74.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_564.wav", "doc_id": "rISrKoXQCx.seg_564", "src_text": "For example, for RoBERTa further trained on the left-leaning Reddit corpus we can see a substantial liberal shift in terms of its political biases.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Beispielsweise können wir bei Roberta eine weitergehende Finanzierung des linken Korpus sehen. In Bezug auf seine politischen Vorurteile.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_87.wav", "doc_id": "TVCREhgqUP.seg_87", "src_text": "We address this by inducing the alignment as part of the training.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir behandeln dies, indem wir die Ausrichtung als Teil des Trainings induzieren.", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_15.wav", "doc_id": "aQpIWggfCo.seg_15", "src_text": "We find that all language models achieve unsatisfactory results on planning for specific goals.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir stellen fest, dass alle linearen Modelle unzufriedenstellende Ergebnisse bei der Planung für spezifische Ziele erzielen.", "score": 53.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_9.wav", "doc_id": "aQpIWggfCo.seg_9", "src_text": "A good planner should write scripts that are reasonable and faithful to constraints.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Ein guter Planer sollte Skripte schreiben, die vernünftig und den Einschränkungen treu sind.", "score": 88.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_52.wav", "doc_id": "TVCREhgqUP.seg_52", "src_text": "As usual, we have a training set of utterances.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "es so aus, als hätten Sie in", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_112.wav", "doc_id": "uZBWfYjYnf.seg_112", "src_text": "So we want our curves to be as high as possible on this plot.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "dass unsere Queue so hoch wie möglich auf", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_455.wav", "doc_id": "hgIDlKNiFM.seg_455", "src_text": "We also observe that using more data translated to better performance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "um mehrsprachig zu sein, und dass die Verwendung mehrerer Daten zu besseren Leistungen führt.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_371.wav", "doc_id": "gGbuDbHhyc.seg_371", "src_text": "So in practice, there's no reason to choose more complex WSL methods which require more computation time and disk space.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Daher gibt es keinen Grund, in der Praxis komplexere WS-L-Methoden zu wählen, die mehr Berechnungszeit und Diskraum benötigen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_659.wav", "doc_id": "FLkGnzVRew.seg_659", "src_text": "\"Cumulative\" accumulates all the data collected from active annotation so far, whereas \"Iterative\" updates the model by training on the latest set of data collected.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Alle Daten aus den aktiven Lern- und Anmerkungsrunden werden kumuliert, um das Modell schrittweise zu aktualisieren. Bei den", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_348.wav", "doc_id": "gGbuDbHhyc.seg_348", "src_text": "If we directly train neural networks on weakly labeled data, the neural networks tend to memorize the label noise and do not generalize.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wenn wir neuronale Netze direkt trainieren und schwach beschriftete Daten haben, tendieren die neuronalen Netze dazu, den Label-Rauschen zu memorieren und nicht zu", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_244.wav", "doc_id": "oYCKgTzTDy.seg_244", "src_text": "We found that Encoder-Decoder obtains the best performance on all nine datasets.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir stellten fest, dass der Encoder-Decoder die beste Leistung auf allen neun Datensätzen erzielt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_136.wav", "doc_id": "wLqFAuDnKa.seg_136", "src_text": "And this can go, in extreme cases, up to 40 BLEURT points.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Und dies kann in extremen Fällen bis zu 40 Punkte", "score": 68.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_774.wav", "doc_id": "WTTtiRKFZI.seg_774", "src_text": "So in this case, Lisa.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "ist, in diesem Fall Lisa.", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_744.wav", "doc_id": "XejEJmgUmE.seg_744", "src_text": "And what we do is that to recreate like longer sequences and which are acceptable and which has the same matching of the grammatical structure.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und was wir tun, um längere Sequenzen zu erzeugen, die akzeptabel sind und die gleiche grammatikalische Struktur haben,", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_279.wav", "doc_id": "PIZEXUFLAR.seg_279", "src_text": "Ok, now I'm going to talk about multi-modal instruction tuning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Ok, ich werde jetzt über die Multi-Modell-Unterstützung sprechen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_324.wav", "doc_id": "dJGfOSFgZO.seg_324", "src_text": "For comparison, we also evaluated these conversations using three existing methods: Likert ratings on the turn-level, Likert ratings on the dialogue-level, and dialogue-level pairwise comparisons.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Zum Vergleich bewerteten wir diese Gespräche mit drei existierenden Methoden: Likert-Skalen auf der Wendebene, Likert-Skalen auf der Dialogebene und Paarweisen-Vergleiche auf der Dialogebene.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_192.wav", "doc_id": "SLpqvupgvW.seg_192", "src_text": "The third one is when they have similar descriptions on Wikipedia.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Der dritte ist, wenn sie ähnliche Beschreibungen auf Wikipedia haben", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_212.wav", "doc_id": "SLpqvupgvW.seg_212", "src_text": "We've also shown that the models are domain-generalizable.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir zeigen auch, dass die Modelle domänenübergreifend sind.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_402.wav", "doc_id": "WBLMIsdIrq.seg_402", "src_text": "Now we analyze words with high P-CXMI to look for patterns between these words.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Jetzt analysieren wir Wörter mit hohen PSMI, um nach Übereinstimmungen zwischen diesen Wörtern zu suchen.", "score": 37.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_381.wav", "doc_id": "gGbuDbHhyc.seg_381", "src_text": "Please feel free to check it out.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "finden können. Bitte fühlen Sie sich frei, ihn zu überprüfen.", "score": 58.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_107.wav", "doc_id": "uZBWfYjYnf.seg_107", "src_text": "For example, if we receive a speech chunk containing \"I'm going to talk about...\" and our model predicts the translation in German, and we will look at the cross-attention weights, we'll see that the first two words points to the earliest received speech frames, while the last word points to the last received speech frames, as lambda speech frames.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Beispielsweise, wenn wir einen Sprachblock mit der Übersetzung „Ich werde darüber sprechen“ erhalten und unser Modell eine Übersetzung in Deutsch vorhersagt. Und wir werden auf die Querverbindung achten. Wir werden sehen, dass die ersten beiden Wörter auf die frühesten erhaltenen Sprachrahmen verweisen, während das letzte Wort auf die letzten erhaltenen Sprachrahmen verweist.", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_727.wav", "doc_id": "oaOHnMCwad.seg_727", "src_text": "Thank you.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vielen Dank.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_456.wav", "doc_id": "hgIDlKNiFM.seg_456", "src_text": "Overall, from-scratch pre-training seems to obtain higher performance on most of the tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Insgesamt scheint das Training mit Schrammen eine höhere Leistung auf den meisten Aufgaben zu erzielen. Unsere Experimente mit kontinuierlicher", "score": 63.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_625.wav", "doc_id": "oeooqChmKK.seg_625", "src_text": "This suggests that when trained on generic reference resolution data sets, most learn to exploit surface cues, which are not useful when testing on KITMUS where such queues have been removed.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dies deutet darauf hin, dass trainierte Mäuse lernen, Oberflächenhinweise auszunutzen, die bei der Untersuchung von Kitts, wo solche Hinweise entfernt wurden, nicht nützlich sind.", "score": 44.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_531.wav", "doc_id": "dvGkKzmIaN.seg_531", "src_text": "We first construct a back door and a benign data set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir bauen zunächst eine Rückwand und einen bösartigen", "score": 49.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_117.wav", "doc_id": "uZBWfYjYnf.seg_117", "src_text": "And we see that it outperforms all the strategies applied to offline models since the curves are shifted over the left.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir sehen, dass EAD alle Strategien, die auf Offline-Modelle angewendet werden, übertrifft, da ihre Kurven nach links verschoben sind.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_638.wav", "doc_id": "FLkGnzVRew.seg_638", "src_text": "And they have a consonance relationship.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "und sie haben eine konsensuale Beziehung. Die Diskrepanz", "score": 43.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_311.wav", "doc_id": "dJGfOSFgZO.seg_311", "src_text": "This work was done by the Emory NLP Lab led by Professor Jinho Choi at Emory University and in collaboration with Amazon Alexa AI.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Diese Arbeit wurde von dem Emory NLP-Labor geleitet von Professor Jino Choi an der Emory University und in Zusammenarbeit mit Amazon Alexa erstellt. Lassen", "score": 73.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_321.wav", "doc_id": "dJGfOSFgZO.seg_321", "src_text": "ABC-Eval is capable of measuring the rates at which chat models will commit various thematic errors.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "ABC-EVAL kann die Raten messen, mit denen Chat-Modelle verschiedene thematische Fehler begehen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_191.wav", "doc_id": "SLpqvupgvW.seg_191", "src_text": "The second one is when the entities have similar titles, for example, two books with the name \"The Return\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Das zweite ist, wenn die Einheiten ähnliche Titel haben, z. B. zwei Bücher mit dem Namen „The Rite“.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_207.wav", "doc_id": "SLpqvupgvW.seg_207", "src_text": "If the language model has access to the exact same background knowledge as the annotators, then the accuracy is really high, it's around 92 to 95%.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wenn das Sprachmodell Zugriff auf die exakt gleiche Hintergrundwissensbasis wie die Annotatoren hat, ist die Genauigkeit wirklich hoch: sie liegt bei etwa 92-95. Aber", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_551.wav", "doc_id": "rISrKoXQCx.seg_551", "src_text": "This has created a mixed blessing for language model applications.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Lobeshymne für eine Sprachmodellanwendung geschaffen. So können sie auf", "score": 68.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_236.wav", "doc_id": "oYCKgTzTDy.seg_236", "src_text": "For example, we put the German, English, Chinese queries together to train a multilingual model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zum Beispiel vereinigen wir die deutschen, englischen und chinesischen Fragen, um ein mehrsprachiges Modell zu trainieren,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_737.wav", "doc_id": "XejEJmgUmE.seg_737", "src_text": "The current MPP pipeline basically doesn't allow us to evaluate a model's acceptance towards longer sentences.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die aktuelle MP-Pipeline erlaubt es uns im Grunde nicht, die Akzeptanz eines Modells für längere Sätze zu bewerten.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_51.wav", "doc_id": "TVCREhgqUP.seg_51", "src_text": "In the context of semantic parsing, testing for compositional generalization might look like this.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Im Kontext der semantischen Parsenierung, die für die kompositorische Generalisierung getestet wird, könnte", "score": 79.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_856.wav", "doc_id": "GvEBWkLmuI.seg_856", "src_text": "However, when we actually look at the distribution of the words and lexicon, we find very different things.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wenn wir uns jedoch die Verteilung der Wörter in einem Wörterbuch anschauen, stellen wir jedoch fest, dass es sehr unterschiedliche Dinge", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_478.wav", "doc_id": "SUkmfOTvGi.seg_478", "src_text": "The first one is the model architecture.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die erste ist die Modellarchitektur.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_245.wav", "doc_id": "oYCKgTzTDy.seg_245", "src_text": "And we evaluate on mT5 and XLM-R + PTR on multilingual setting.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Und wir bewerten auf M.T.5 und als Beispiel XLM-R + PDR auf einer multilinguellen Einstellung.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_680.wav", "doc_id": "oaOHnMCwad.seg_680", "src_text": "But that's not really the case for Aditya Sharma.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Aber das ist wirklich der Fall für Aditya Sharma, die", "score": 72.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_521.wav", "doc_id": "dvGkKzmIaN.seg_521", "src_text": "Watermark injection and copyright verification.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wasserzeichen-Injektion und eine Urheberrechtsverwertung.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_718.wav", "doc_id": "oaOHnMCwad.seg_718", "src_text": "So we have a few recommendations for this.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Also haben wir einige Empfehlungen", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_641.wav", "doc_id": "FLkGnzVRew.seg_641", "src_text": "Studying cognitive dissonance can help us understand the effects of disagreement among people, track trends and belief values, and attitude changes in population.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Das Studium der konzeptuellen Distanz kann helfen, die Auswirkungen von Meinungsverschiedenheiten unter Menschen, Trends und Überzeugungen, Werten und Einstellungen in der Bevölkerung zu verstehen.", "score": 57.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_639.wav", "doc_id": "FLkGnzVRew.seg_639", "src_text": "While dissonance is a very common phenomenon we experienced in daily decision making, they are really rare to find expressed in language among other kinds of discourse relations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "ist ein sehr häufiges Phänomen, das wir in der täglichen Entscheidungsfindung erleben, und sie ist wirklich bereit, in einer Sprache ausdrücklich auszudrücken, die wir in anderen Diskussionen nicht verwenden.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_522.wav", "doc_id": "dvGkKzmIaN.seg_522", "src_text": "Before these main steps, we first select a trigger set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vor diesen Hauptschritten wählen wir zunächst eine Antriebsmenge aus;", "score": 84.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_470.wav", "doc_id": "SUkmfOTvGi.seg_470", "src_text": "At the same time, if we do observe poor generalization, what causes the performance drop of these models?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Gleichzeitig, wenn wir eine schlechte Generalisierung beobachten, was verursacht die Leistungseinbußen dieser Modelle?", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_259.wav", "doc_id": "oYCKgTzTDy.seg_259", "src_text": "And our results show many interesting findings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und unsere Ergebnisse zeigen viele interessante Erkenntnisse,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_654.wav", "doc_id": "FLkGnzVRew.seg_654", "src_text": "We transfer from two different tasks: topic independent dissonance stance classification, a task that determines if two debate statements from different people are in agreement or in disagreement, irrespective of topic, called debate here, and on binary classification of expansion and comparison classes of PDTB since these two are closely related to the conception of consonance and dissonance and we call them CE here.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir übertragen von zwei verschiedenen Themen: Thema unabhängig von der Standortklassifizierung, die bestimmt, ob zwei Aussagen von verschiedenen Personen in Übereinstimmung oder in Unmöglichkeit sind, unabhängig vom Thema. Hier wird eine Debatte geführt und über die binäre Klassifizierung von Expansion und Vergleichsklassen von PendetB gesprochen, da diese beiden eng mit dem Konzept von Konsonanten und Dissonanzen verwandt sind, und wir nennen sie hier CeE.", "score": 15.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_857.wav", "doc_id": "GvEBWkLmuI.seg_857", "src_text": "So, while the generated personas have much higher rates of the lexicon words, the human-written ones have a much wider distribution of words, while the stereotype words that are in the generated personas are really just the words \"tall\" and \"athletic\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "die generierten Personas viel höhere Raten der Lexikonwörter haben. Die von Menschen geschriebenen Texte haben eine viel breitere Verteilung von Wörtern, während die stereotypen Wörter, die in den generierten Persönlichkeiten vorkommen, wirklich nur die Wörter \"toll\" und \"athletisch\" sind.", "score": 79.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_830.wav", "doc_id": "GvEBWkLmuI.seg_830", "src_text": "In recent years, many have documented the prevalence of social bias and stereotypes in large language models, or LLMs.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "In den letzten Jahren haben viele die Vorherrschaft des sozialen Biaßes und Stereotypen in großen Sprachmodellen oder LLMs dokumentiert.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_823.wav", "doc_id": "WTTtiRKFZI.seg_823", "src_text": "But when the governor is on the right this tendency disappears.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "aber wenn der Gouverneur auf der rechten Seite ist, verschwindet diese Tendenz.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_602.wav", "doc_id": "oeooqChmKK.seg_602", "src_text": "Kea is a Baker.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Kiah ist Bäcker.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_350.wav", "doc_id": "gGbuDbHhyc.seg_350", "src_text": "In recent works in WSL, so WSL stands for Weakly Supervised Learning, a common claim is that people say that they only train models on the weakly labeled data and achieve high performance on clean test sets.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "WSL ein Akronym für wöchentliches Superwise-Lernen. Eine häufige Behauptung ist, dass Menschen nur Modelle unter wöchentlichem Label-Data trainieren und auf sauberen Test-Sets hohe Leistung erzielen.", "score": 69.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_492.wav", "doc_id": "SUkmfOTvGi.seg_492", "src_text": "Our conclusion is that, for good generalization we would need a better model architecture, larger model size, as well as more fine tuning examples.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Unser Schluss ist, dass wir für eine gute Generalisierung eine bessere Modellarchitektur, eine größere Modellgröße sowie mehr feinjustierte Beispiele", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_217.wav", "doc_id": "oYCKgTzTDy.seg_217", "src_text": "So, semantic parsing is a task to build semantic representations of user queries such as SQL and Lambda Calculus.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "semantische Parsierung ist also eine Aufgabe, um semantische Darstellungen von Benutzeranfragen wie Zequel und Lambda-Kalküle zu bauen.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_486.wav", "doc_id": "SUkmfOTvGi.seg_486", "src_text": "The second hypothesis is temporal drift which is the performance degradation that is caused by the increasing temporal gap between the train and the test data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die zweite Hypothese ist der temporale Abfall, der durch den zunehmenden zeitlichen Abstand zwischen Zug und Testdaten verursacht wird.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_460.wav", "doc_id": "hgIDlKNiFM.seg_460", "src_text": "We are also observing that more specialized data is better, but it doesn't scale well.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "dass spezialisierte Daten besser sind, mehr spezialisierte Daten besser sind, aber sie skaliert nicht gut, da", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_303.wav", "doc_id": "PIZEXUFLAR.seg_303", "src_text": "So overall, we propose the first large scale multi-model instruction tuning dataset with significantly improved their short capability of OFA, and we explore different transfer learning technique and show their benefits.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Insgesamt schlagen wir also ein erstes groß angelegtes multimodales Anweisungsjustierung-Datensatz vor, wir verbessern die Echtzeitfähigkeit von OFA erheblich und wir untersuchen verschiedene Transferlernmethoden und zeigen ihre Vorteile.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_194.wav", "doc_id": "SLpqvupgvW.seg_194", "src_text": "For example, the same genre or the same artist for a song.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "z. B. das gleiche Genre oder der gleiche Künstler.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_669.wav", "doc_id": "FLkGnzVRew.seg_669", "src_text": "In summary, we find that PRC is a simple AL strategy for rare class acquisition and cold starting AL with appropriately designed transfer learning task and help significantly.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "stellen wir fest, dass die PRRC eine einfache AL-Strategie für die Aufnahme in die nächste Klasse ist und hilfreich ist, wenn man sie richtig anwendet.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_799.wav", "doc_id": "WTTtiRKFZI.seg_799", "src_text": "So the reasoning here is that this is possible because even though this sentence violates the general grammatical principle that direct objects should be next to the verb, it satisfies the principle of dependency length minimization, which says that shorter dependencies are preferred.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Bienen, gelesen habe, also der Grund hierfür. Ist das möglich, weil, auch wenn diese Sätze die allgemeine grammatische Regel verletzen, dass ein direktes Objekt direkt nach dem Verb stehen sollte, sie den Prinzip der Abhängigkeitslänge minimierung erfüllen, das besagt, dass kürzere Abhängigkeiten bevorzugt werden.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_438.wav", "doc_id": "hgIDlKNiFM.seg_438", "src_text": "Since its release in 2018, BERT has become one of the most effective approach to solve natural language processing tasks and offers huge performance gains compared to historical static and contextualized methods such as Word2vec, fastText, or more.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Seit seiner Veröffentlichung im Jahr 2018 ist BERT zu einem der effektivsten Ansätze zur Lösung von Aufgaben der natürlichen Sprachverarbeitung geworden und bietet einen enormen Leistungszuwachs im Vergleich zu historischen statischen und kontextualisierten Methoden wie Word2Vec oder GloVe. \"oder was?\".", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_536.wav", "doc_id": "dvGkKzmIaN.seg_536", "src_text": "Meanwhile, we also apply KS test and use its p-value as the third metric.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "In der Zwischenzeit wenden wir auch den KS-Test an und verwenden sein p-Wert als dritte Metrik.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_496.wav", "doc_id": "SUkmfOTvGi.seg_496", "src_text": "And we found that the answer is actually a resounding yes.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "noch im Jahr 2023? Die Antwort ist tatsächlich ein eindeutiger „Ja“.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_465.wav", "doc_id": "SUkmfOTvGi.seg_465", "src_text": "Let's get started.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "beginnen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_224.wav", "doc_id": "oYCKgTzTDy.seg_224", "src_text": "For example, there's only one single model to evaluate them.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zum Beispiel gibt es nur ein einziges Modell zur Bewertung. Zu", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_466.wav", "doc_id": "SUkmfOTvGi.seg_466", "src_text": "Our paper investigated the problem of generalization using the Named Entity Recognition Task or the NER task.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In unserem Papier untersuchten wir das Problem der Generalisierung, indem wir die Aufgabe der Erkennung benannter Entitäten oder die NER-Aufgabe verwendeten.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_392.wav", "doc_id": "WBLMIsdIrq.seg_392", "src_text": "Firstly because only a small portion of translations depend on context which makes corpus-level metrics like BLEU unable to capture these translations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "einmal, weil nur ein kleiner Teil der Übersetzungen von Kontext abhängt, was es Korpusniveau-Metriken wie Blue unmöglich macht, diese Übersetzungen zu erfassen. Und", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_180.wav", "doc_id": "SLpqvupgvW.seg_180", "src_text": "Which is the alternative question.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Das ist die alternative", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_290.wav", "doc_id": "PIZEXUFLAR.seg_290", "src_text": "If it's a multi-modal generation task, we report Rouge-L. For NLP task, we report Rouge-L as well.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wenn es sich um eine Multimodal-Generation handelt, berichten wir über die RuG. Für NRP-Aufgaben berichten wir auch über die RuG.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_801.wav", "doc_id": "WTTtiRKFZI.seg_801", "src_text": "So here we have a dependency from \"read\" to the adjunct of length 7 measured in words and from \"read\" to \"book\" of length 4, so together it's 11.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hier haben wir also die Abhängigkeit von rot bis zum Adjektiv von Länge sieben gemessen in Wörtern und von rot bis zum Buch von Länge vier. Wenn Sie sich bewegen", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_734.wav", "doc_id": "XejEJmgUmE.seg_734", "src_text": "Which can also include grammaticality like BLiMP, SyntaxGym, or acceptability in terms of stereotypes such as CrowS pairs.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "die auch Grammatikalität wie „Blimp“ oder „Syntax Gem“ oder Akzeptabilität in Bezug auf Stereotypen wie „Crowds“ umfassen können.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_427.wav", "doc_id": "WBLMIsdIrq.seg_427", "src_text": "We also compared different commercial systems and our benchmark shows that DeepL is usually more accurate than Google Translate for document-level translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir vergleichen auch unterschiedliche kommerzielle Systeme und unsere Benchmarks zeigen, dass die Google-Übersetzung für lokale Dokumentenübersetzung normalerweise genauer ist.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_63.wav", "doc_id": "TVCREhgqUP.seg_63", "src_text": "This can be complicated and sometimes a computationally expensive process.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dies kann ein komplizierter und manchmal rechenintensiver Prozess sein.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_276.wav", "doc_id": "PIZEXUFLAR.seg_276", "src_text": "Here we show some example instances from our MultiInstruct dataset, to unify the processing of various input and output data types.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hier zeigen wir einige Beispielinstanzen aus unserem Multi-Instanz-Datensatz. Um die Verarbeitung verschiedener Eingabe- und Ausgabedatentypen zu vereinheitlichen,", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_615.wav", "doc_id": "oeooqChmKK.seg_615", "src_text": "This last setting is especially interesting, since it simulates the case where the background knowledge necessary to solve a task is not part of the pretrain data of models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die letzte Einstellung ist besonders interessant, da sie den Fall simuliert, bei dem das Hintergrundwissen zur Lösung einer Aufgabe nicht Teil der vorgefertigten Modelle ist, da sich", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_54.wav", "doc_id": "TVCREhgqUP.seg_54", "src_text": "And \"Mary knew that the girl slept.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "und die Mädchen schlafen, und ich neue Mädchen schlafen.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_489.wav", "doc_id": "SUkmfOTvGi.seg_489", "src_text": "And this shows us that adaptive overfitting in this case is not observed.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Und das zeigt uns, dass in diesem Fall keine anpassungsfähige Überdimensionierung beobachtet wird.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_450.wav", "doc_id": "hgIDlKNiFM.seg_450", "src_text": "In total, we have seven models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Insgesamt haben wir sieben Modelle.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_109.wav", "doc_id": "uZBWfYjYnf.seg_109", "src_text": "If we go on and we receive another speech chunk, and our model predicts other three words and we will look at those cross-attention weights, we will see that no word points to the last lambda speech frames.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wenn wir fortfahren und einen anderen Sprechrhythmus erhalten, und unser Modell weitere drei Wörter vorhersagt, werden wir auf diese Cross-Attention-Ways schauen. Wir werden sehen, dass keine Worte auf die letzten Lamda-Sprechrahmen verweisen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_140.wav", "doc_id": "wLqFAuDnKa.seg_140", "src_text": "We saw that the actual form of the prompting doesn't have a big influence in the case of several short promptings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir sahen, dass die tatsächliche Form der Anrufung keinen großen Einfluss auf den Fall von mehreren Anrufungen hat.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_558.wav", "doc_id": "rISrKoXQCx.seg_558", "src_text": "So some preliminary results demonstrate that first, language models do have varying political leanings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "vorläufige Ergebnisse, dass die ersten Sprachmodelle immer noch unterschiedliche politische Präferenzen aufweisen.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_569.wav", "doc_id": "rISrKoXQCx.seg_569", "src_text": "So this indicates that language models can also pick up the polarisation in our society.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "so dass die Sprachmodelle auch die Polarisierung in unserer Gesellschaft aufgreifen können.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_450.wav", "doc_id": "hgIDlKNiFM.seg_450", "src_text": "In total, we have seven models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Insgesamt haben wir sieben Modelle.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_95.wav", "doc_id": "uZBWfYjYnf.seg_95", "src_text": "And what are the problems of the current SimulST models?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Und was sind die Probleme der aktuellen SimulST-Modelle?", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_851.wav", "doc_id": "GvEBWkLmuI.seg_851", "src_text": "And more broadly, dominant groups in society are both linguistically and socially unmarked, while the marginalized groups are usually marked.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "markiert. Und mehr oder weniger, die dominierenden Gruppen in der Gesellschaft sind sowohl sprachlich als auch sozial unmarkiert, während die marginalisierten Gruppen üblicherweise markiert sind. Unsere Methode", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_627.wav", "doc_id": "oeooqChmKK.seg_627", "src_text": "To summarize the main takeaways of our paper, many coreference resolution models appear unable to reason over knowledge from different sources without task-specific training.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Um die wichtigsten Aspekte unseres Papiers zusammenzufassen: Viele Korrelationsmodelle scheinen nicht in der Lage zu sein, Wissen aus verschiedenen Quellen ohne spezifisches Training zu nutzen.", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_855.wav", "doc_id": "GvEBWkLmuI.seg_855", "src_text": "So first we use a lexicon of stereotypes, and we find that the generated personas contain a lot more stereotypes than the human-written ones.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "also verwenden wir zunächst ein Elektronen-Stereotyp und stellen fest, dass die geborene Person eine viel größere Anzahl an Stereotypen enthält als die der Menschen, die sie kennen.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_608.wav", "doc_id": "oeooqChmKK.seg_608", "src_text": "And second, background knowledge such as \"Judges decide cases in law courts.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Diener ein Richter ist, und zweitens, die Hintergrundkenntnis, wie etwa, dass Richter Fälle in Gerichtshöfen entscheiden.", "score": 43.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_603.wav", "doc_id": "oeooqChmKK.seg_603", "src_text": "Servin and Kea met at a park.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Servin und Kiah trafen sich nach einem", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_876.wav", "doc_id": "GvEBWkLmuI.seg_876", "src_text": "And finally, there should really be increased transparency about bias mitigation methods, because for instance, like these positive stereotypes, we don't know if it's because there is some sort of weird overly-excessive value alignment going on, or maybe some other anti-stereotyping methods that are resulting in these pernicious patterns.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Und schließlich sollten die Transparenz über die Bias-Methode wirklich erhöht werden. Weil wir zum Beispiel nicht wissen, ob es auf diese positiven Stereotypen etwas wie „irgendwie seltsam“ gibt. Übermäßig hohe Werte werden aufgenommen, oder vielleicht andere, wie anti-stereotypierende Methoden, die zu diesen schädlichen Mustern führen.", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_74.wav", "doc_id": "TVCREhgqUP.seg_74", "src_text": "Conceptually, our permutation model works roughly like this.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Konzeptionell funktioniert unser Permutationsmodell ungefähr so.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_274.wav", "doc_id": "PIZEXUFLAR.seg_274", "src_text": "For investigating multi-modal instruction tuning on our proposed dataset, we take OFA, a unified multi-modal pre-trained model, as our base model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "der multimodalen Anweisung in unserem Vorschlag verwenden wir Ofa als ein einheitliches multimodales Darstellungsmodell als Basismodell.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_877.wav", "doc_id": "GvEBWkLmuI.seg_877", "src_text": "We just really can't make any assumptions or really study that further, without more transparency.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Das ist wirklich nicht möglich, oder man muss das weiter mit mehr Transparenz untersuchen.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_78.wav", "doc_id": "TVCREhgqUP.seg_78", "src_text": "We determine the third token in the output in a similar way by jumping to another multiset token.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir bestimmen das dritte Token in der Ausgabe auf ähnliche Weise, indem wir zu einem anderen Multisets-Token springen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_679.wav", "doc_id": "oaOHnMCwad.seg_679", "src_text": "Where prospective API is able to detect correctly toxic instances.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "denn Ihre API ist in der Lage, korrekte toxische Einflüsse zu erkennen.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_194.wav", "doc_id": "SLpqvupgvW.seg_194", "src_text": "For example, the same genre or the same artist for a song.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "beispielsweise dasselbe Genre oder dasselbe Künstler. Wenn wir diese alternative", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_817.wav", "doc_id": "WTTtiRKFZI.seg_817", "src_text": "Here we have coordination of two verbs and there's no outsides, external governor.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "es eine Koordination von zwei Wörtern gibt, aber keine Außen-Gouverneur.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_740.wav", "doc_id": "XejEJmgUmE.seg_740", "src_text": "We're trying to revisit the MPP pipeline by asking the model to evaluate acceptability on longer and longer sequences.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir versuchen, die NPB-Pipeline zu überprüfen, indem wir das Modell bitten, die Akzeptabilität auf längeren und längeren Sequenzen zu bewerten.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_821.wav", "doc_id": "WTTtiRKFZI.seg_821", "src_text": "So I'll concentrate on the right one.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "uns auf die rechte Spalte zu konzentrieren.", "score": 4.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_772.wav", "doc_id": "WTTtiRKFZI.seg_772", "src_text": "As you may know, there are different dependency structures assumed by different theories and corpus approaches.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wie wir wissen, gibt es verschiedene Abhängigkeitsstrukturen, die von verschiedenen Theorien und Körperprozessen genutzt werden,", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_161.wav", "doc_id": "SLpqvupgvW.seg_161", "src_text": "My name is Javad Hosseini and this is a joint work with Filip Radlinski, Silvia Pareti, and Annie Louis.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Mein Name ist Jawad Hussain, und dies ist eine gemeinsame Arbeit mit Philip Radlinski, Sylvia Parati und Annie Tows.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_728.wav", "doc_id": "XejEJmgUmE.seg_728", "src_text": "Hi, everyone.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo alle,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_173.wav", "doc_id": "SLpqvupgvW.seg_173", "src_text": "We're not aware of a larger-scale public data set for the task, so we collect one using crowd annotation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "einen öffentlichen Datensatz a large-scale public dataset für die Aufgabe gibt, also sammeln wir einen mithilfe der Crowdannotation. Unsere", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_203.wav", "doc_id": "SLpqvupgvW.seg_203", "src_text": "Here are some examples from our dataset.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hier sind einige Beispiele aus unserem Datensatz.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_141.wav", "doc_id": "wLqFAuDnKa.seg_141", "src_text": "It's crucial for zero and one-shot prompting.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Es ist entscheidend für Null und einen", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_423.wav", "doc_id": "WBLMIsdIrq.seg_423", "src_text": "This again demonstrates that it is difficult to determine the best document-level translation system if we use corpus-level metrics alone.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Dies zeigt erneut, dass es schwierig ist, das beste Dokumenten-Niveau-Übersetzungs-System zu bestimmen, wenn wir nur Korpus-Niveau-Metriken verwenden.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_567.wav", "doc_id": "rISrKoXQCx.seg_567", "src_text": "We separately pretrain language models on the two different temporal corpora.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "nach dem Vierzigsten Präsidenten und dem Fünfzigsten Präsidenten der Vereinigten Staaten.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_213.wav", "doc_id": "SLpqvupgvW.seg_213", "src_text": "Here is a link to our dataset.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "ist ein Link zu unserem Datensatz,", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_158.wav", "doc_id": "wLqFAuDnKa.seg_158", "src_text": "Thank you very much.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Vielen Dank.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_822.wav", "doc_id": "WTTtiRKFZI.seg_822", "src_text": "What we see here is that when the governor is on the left, the tendency for the left conjunct to be shorter grows steadily, with the absolute difference in words, and the same is observed when there is no governor as in coordination of sentences.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Was wir sagen, ist, dass das so ist, wenn der Regierungschef auf der linken Seite ist. Die Tendenz, dass der linke Konjunktur kürzer ist, wächst stetig mit dem absoluten Unterschied in den Worten, und das gleiche wird beobachtet, wenn es keinen Gouverneur gibt, der die Sätze koordiniert,", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_601.wav", "doc_id": "oeooqChmKK.seg_601", "src_text": "Servin is a judge.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Serwin ist ein", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_30.wav", "doc_id": "aQpIWggfCo.seg_30", "src_text": "Since large language models are costly to deploy, it's essential to enable language planning ability of smaller and specialized models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Da große Sprachmodelle teuer zu deployen sind, ist es wichtig, Sprachplanung mit etwas kleineren und spezialisierten Modellen zu ermöglichen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_41.wav", "doc_id": "aQpIWggfCo.seg_41", "src_text": "In summary, we establish the constrained language planning problem.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zusammenfassend: Wir stellen das Problem der konstrizierten Sprachplanung fest;", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_598.wav", "doc_id": "oeooqChmKK.seg_598", "src_text": "We introduce a coreference resolution task, designed to probe for the ability to draw on knowledge available in different sources.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir führen eine Korreferenzlösungsaufgabe ein, die darauf ausgelegt ist, die Fähigkeit zu testen, auf Wissen aus verschiedenen Quellen zu ziehen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_697.wav", "doc_id": "oaOHnMCwad.seg_697", "src_text": "We then take the annotations by demographic and compare them to the models and datasets using a Pearson's R correlation score, and thus our framework actually differs from annotator disagreement literature by comparing end users with models and datasets, predictions and labels, as opposed to looking at just annotator agreement or modelling annotator distributions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir nehmen dann die demographischen Annotationen und vergleichen sie mit den Modellen und Datensätzen, indem wir die Korrelationskennzahlen verwenden. Daher unterscheidet sich unser Framework von der Annotatoren-Disagreement-Literatur, indem wir Endnutzer mit Modellen und Datensätzen, Vorhersagen und Etiketten vergleichen, anstatt nur eine Annotatoren-Übereinstimmung oder Modellierung von Annotatoren-Verteilungen zu betrachten.", "score": 86.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_465.wav", "doc_id": "SUkmfOTvGi.seg_465", "src_text": "Let's get started.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "2023? Lassen Sie", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_805.wav", "doc_id": "WTTtiRKFZI.seg_805", "src_text": "Right?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Ordnung, aber", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_356.wav", "doc_id": "gGbuDbHhyc.seg_356", "src_text": "Second, if clean data is required, or if clean data is mandatory for WSL to work, then how many clean samples do we need?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "saubere Daten erforderlich sind oder wenn saubere Daten für die Arbeit von WSL obligatorisch sind, wie viele saubere Stichproben benötigen wir dann?", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_271.wav", "doc_id": "PIZEXUFLAR.seg_271", "src_text": "Therefore, this motivates us to build a multi-modal instruction tuning dataset.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Instruction-Set, daher motiviert es uns, ein Multimodal Instruction-Set zu erstellen.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_540.wav", "doc_id": "dvGkKzmIaN.seg_540", "src_text": "We also validate the covertness of the provided embedding by visualising the embedding of sentences on four dataset [INAUDIBLE 4:39] PCA.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir haben auch die Vertraulichkeit der bereitgestellten Embeddings durch die Visualisierung der Embeddings von Sätzen gefordert, z.B.", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_332.wav", "doc_id": "dJGfOSFgZO.seg_332", "src_text": "These reliable, informative, and distinct ABC-Eval metrics enable us to evaluate conversational AI with a higher resolution than previous methods are able to achieve.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Diese zuverlässigen, aussagekräftigen und klaren A.B.C.-Maße ermöglichen es uns, die Konversation mit einer höheren Auflösung zu bewerten, als es mit den vorherigen Methoden möglich war.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_579.wav", "doc_id": "rISrKoXQCx.seg_579", "src_text": "So a little bit of discussion.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "die Verwendung von politischen Modellierungen der Sprache entstehen, angehen sollten.", "score": 25.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_604.wav", "doc_id": "oeooqChmKK.seg_604", "src_text": "After a long day at work deciding cases in a law court, he was happy to relax.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "langen Arbeitstag im Park, um sich zu entspannen.", "score": 69.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_63.wav", "doc_id": "TVCREhgqUP.seg_63", "src_text": "This can be complicated and sometimes a computationally expensive process.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Dies kann kompliziert sein und manchmal ein computergestütztes Verfahren", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_484.wav", "doc_id": "SUkmfOTvGi.seg_484", "src_text": "To our next question, what causes the performance drop of some models, We had two hypothesis.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zur nächsten Frage: Was verursacht den Leistungseinbruch einiger Modelle? Wir hatten zwei Hypothesen:", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_323.wav", "doc_id": "dJGfOSFgZO.seg_323", "src_text": "To determine what kind of evaluation is most effective, we selected four state-of-the-art chat models and evaluated them on 100 human-bot conversations per model using ABC-Eval.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Um herauszufinden, was für eine Art von Bewertung die effektivste ist, haben wir vier Chat-Modelle ausgewählt und sie anhand von Hunderten von menschlichen Gesprächen pro Modell bewertet.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_478.wav", "doc_id": "SUkmfOTvGi.seg_478", "src_text": "The first one is the model architecture.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die erste ist die Modellarchitektur.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_661.wav", "doc_id": "FLkGnzVRew.seg_661", "src_text": "Next, to improve the number of dissonance examples, we use a Probability-of-Rare-Class strategy — PRC — to select mostly the examples that are highly likely to be descended by the current model at any round of rare.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "nächstes verbessern wir die Anzahl der Dissimilitud-Beispiele, indem wir die Wahrscheinlichkeit einer seltenen Klasse-Strategie PRC verwenden, um die Beispiele zu wählen, die am wahrscheinlichsten von dem aktuellen Modell in jeder Runde von ALE abweichen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_727.wav", "doc_id": "oaOHnMCwad.seg_727", "src_text": "Thank you.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vielen Dank.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_553.wav", "doc_id": "rISrKoXQCx.seg_553", "src_text": "On the other hand, these different political opinions are inherently socially biased and might lead to potential fairness issues in downstream task applications.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und auf der anderen Seite sind diese unterschiedlichen politischen Meinungen sozial voreingenommen und möglicherweise zu potenziellen Fairnessproblemen in downstream-Anwendungen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_701.wav", "doc_id": "oaOHnMCwad.seg_701", "src_text": "We host 2 tasks on lab in the wild, one of them being social acceptability, and the way this works is that participants will read a situation from the social chemistry dataset and, then they'll write how socially acceptable a situation is.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Leben im Freien, einer davon ist die soziale Akzeptanz, und die Art und Weise, wie dies funktioniert, ist, dass die Teilnehmer eine Situation aus dem Social Chemistry-Datensatz lesen und dann beurteilen, wie sozial akzeptabel eine Situation ist.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_152.wav", "doc_id": "wLqFAuDnKa.seg_152", "src_text": "The insights that we gained from the human evaluation that we performed using the MQM framework said that the fluency of PaLM is comparable to state-of-the-art systems but the main difference comes from the accuracy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Die Erkenntnisse, die wir aus der menschlichen Bewertung gewinnen, die wir mit dem MKM-Framework durchführen, sind, dass die Flüssigkeit von Palms mit dem Stand der Kunstsysteme vergleichbar ist, aber der Hauptunterschied kommt von der Genauigkeit, insbesondere.", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_368.wav", "doc_id": "gGbuDbHhyc.seg_368", "src_text": "Finally, the performance improvement claimed in previous WSL approaches can be easily achieved by allowing to continue fine-tuning on the clean validation samples.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Schließlich kann die Leistungsverbesserung, die bei früheren WSL-Ansätzen behauptet wurde, leicht erzielt werden, indem man das Feintuning an sauberen Validierungsbeispielen fortsetzt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_857.wav", "doc_id": "GvEBWkLmuI.seg_857", "src_text": "So, while the generated personas have much higher rates of the lexicon words, the human-written ones have a much wider distribution of words, while the stereotype words that are in the generated personas are really just the words \"tall\" and \"athletic\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "haben also viel höhere Raten der Luxon-Wörter, die humanen haben eine viel breitere Verteilung der Wörter, während die stereotypen Wörter in den generierten Personen wirklich nur die Wörter sind. Es sind", "score": 24.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_11.wav", "doc_id": "aQpIWggfCo.seg_11", "src_text": "Since no dataset of specific goals exists to support our study, we have to acquire these goals first.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "gibt keine Daten außerhalb von spezifischen Zielen, um unsere Studienzeit zu verlängern. Wir müssen diese Ziele zuerst erwerben.", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_184.wav", "doc_id": "SLpqvupgvW.seg_184", "src_text": "The second one, which is the alternative question is generated as follows.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die zweite, die alternative Frage, wird wie folgt generiert.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_858.wav", "doc_id": "GvEBWkLmuI.seg_858", "src_text": "So, really just only the positive or at least non-negative ones.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Also wirklich nur die positiven oder zumindest keine negativen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_734.wav", "doc_id": "XejEJmgUmE.seg_734", "src_text": "Which can also include grammaticality like BLiMP, SyntaxGym, or acceptability in terms of stereotypes such as CrowS pairs.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "auch Grammatikalität wie „blimp“ oder Akzeptanz in Bezug auf Stereotypen wie „Cousins“ beinhalten können.", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_596.wav", "doc_id": "oeooqChmKK.seg_596", "src_text": "Therefore, successful models for knowledge-intensive NLU tasks require the ability to integrate and use both pretrain-time and inference-time knowledge.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "erfordern erfolgreiche Modelle für kognitiv intensive NLU-Aufgaben die Fähigkeit, Vorwissenszeit und Inferenzzeit zu integrieren und zu nutzen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_159.wav", "doc_id": "SLpqvupgvW.seg_159", "src_text": "Hi!", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo,", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_815.wav", "doc_id": "WTTtiRKFZI.seg_815", "src_text": "So the governor is on the left in this example \"I saw Bart and Lisa\" so is the governor is on the left.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "auf der linken Seite, und der Gouverneur ist auf der linken Seite.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_349.wav", "doc_id": "gGbuDbHhyc.seg_349", "src_text": "In weakly supervised learning, training algorithms are proposed to robustly train neural networks under such label noise so that the trained models still generalize well.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "generalisieren. Bei der schwachen Beobachtungstraining werden Trainingsalgorithmen vorgeschlagen, um Nervennetze unter solchen Etiketten „Noise“ robust zu trainieren, so dass die Trainingsmodelle immer noch stark generalisiert werden.", "score": 49.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_601.wav", "doc_id": "oeooqChmKK.seg_601", "src_text": "Servin is a judge.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Serwin ist Richter,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_727.wav", "doc_id": "oaOHnMCwad.seg_727", "src_text": "Thank you.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vielen Dank.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_384.wav", "doc_id": "WBLMIsdIrq.seg_384", "src_text": "A Data-driven, Multilingual Exploration\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "durch Sprachverarbeitung bezieht.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_376.wav", "doc_id": "gGbuDbHhyc.seg_376", "src_text": "For example, report if the model selection is done via clean validation samples.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "ob die Modellauswahl mit sauberen Validierungsmustern durchgeführt wird.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_449.wav", "doc_id": "hgIDlKNiFM.seg_449", "src_text": "Another also based on CamemBERT, but trained this time on the 4 GB of clinical notes and finally, one based on English biomedical model PubMedBERT, and trained on 4 GB of set of NACHOS.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "ebenfalls auf Camembert, aber trainierte diesmal auf vier Kilogramm von Kinkanlot. Und schließlich haben wir ein Modell auf der Grundlage eines englischen biomedizinischen Modells, Bumet, und trainieren es auf vier Gigabyte von Naturstoffen, insgesamt haben", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_596.wav", "doc_id": "oeooqChmKK.seg_596", "src_text": "Therefore, successful models for knowledge-intensive NLU tasks require the ability to integrate and use both pretrain-time and inference-time knowledge.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "erfordern erfolgreiche Modelle für wissensintensive NLU-Aufgaben die Fähigkeit, sowohl Vortrainingszeit als auch Inferenzzeit zu integrieren und zu verwenden.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_57.wav", "doc_id": "TVCREhgqUP.seg_57", "src_text": "In this example, the model has seen shallow recursion during training and is tested on an example with deeper recursion.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In diesem Beispiel hat das Modell eine flache Rückwärtsbewegung während des Trainings und wird auf einem Beispiel mit tiefer Rückwärtsbewegung getestet.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_209.wav", "doc_id": "SLpqvupgvW.seg_209", "src_text": "If the language model has access to some partially overlapping background knowledge, then the accuracy is between 82 to 87%, which is more realistic.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wenn das Sprachmodell auf einige teilweise überlappende Hintergrundkenntnisse zugreifen kann, liegt die Genauigkeit zwischen achtundachtzig und neunundsiebzig Prozent, was beispielsweise dann", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_174.wav", "doc_id": "SLpqvupgvW.seg_174", "src_text": "Our data set covers three different domains: music, books, and recipes.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Datensätze umfassen drei verschiedene Domänen: Musik, Bücher und Rezepte.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_367.wav", "doc_id": "gGbuDbHhyc.seg_367", "src_text": "As we can see, if we have 10 samples per class, direct fine-tuning starts to beat WSL approaches.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wie wir sehen können, wenn wir zehn Proben pro Klasse haben, beginnt die Direkt-Fine-Tuning zu WSL-Ansätzen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_51.wav", "doc_id": "TVCREhgqUP.seg_51", "src_text": "In the context of semantic parsing, testing for compositional generalization might look like this.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Im Kontext der semantischen Analyse könnte das Testen für die Kompositionsgeneralisierung wie folgt aussehen:", "score": 86.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_344.wav", "doc_id": "gGbuDbHhyc.seg_344", "src_text": "I'd like to begin with a brief introduction to weak supervision and weakly supervised learning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Ich würde gerne mit einer kurzen Einführung zum wöchentlichen Überwachung und wöchentlichen Überwachungsunterricht beginnen.", "score": 19.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_62.wav", "doc_id": "TVCREhgqUP.seg_62", "src_text": "This works well, but trees are usually not given and need to be obtained somehow.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Das funktioniert auch, aber es ist normalerweise nicht möglich, etwas davon zu erhalten.", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_397.wav", "doc_id": "WBLMIsdIrq.seg_397", "src_text": "To answer the first question, we started by measuring how much a word depends on context during translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Um die erste Frage zu beantworten, begannen wir damit, zu messen, wie viel ein Wort bei der Übersetzung von Kontext abhängt.", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_441.wav", "doc_id": "hgIDlKNiFM.seg_441", "src_text": "However, French didn't have any open source model for biomedical until now.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "neues Open-Source-Modell für BioMedicine. Wir stellen uns also die Frage,", "score": 72.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_338.wav", "doc_id": "dJGfOSFgZO.seg_338", "src_text": "We hope ABC-Eval can be leveraged by others in the field as a meaningful step in this direction.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir hoffen, dass A.B.C.Eval von anderen in diesem Bereich als bedeutender Schritt in diese Richtung angesehen wird, und", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_463.wav", "doc_id": "SUkmfOTvGi.seg_463", "src_text": "Hello everyone, my name is Shuheng.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hallo alle, ich heiße Suhun.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_839.wav", "doc_id": "GvEBWkLmuI.seg_839", "src_text": "Immediately we see that, while the outputs aren't overtly negative or toxic in the traditional sense of these words, there are some interesting patterns.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Sofort sehen wir, dass die Ausgaben nicht offensichtlich negativ oder giftig sind, im traditionellen Sinne dieser Wörter. Es gibt einige interessante Muster:", "score": 69.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_766.wav", "doc_id": "XejEJmgUmE.seg_766", "src_text": "That is, when we perturb the sentences in the acceptable domain, we see similar increase in all the perturbations and when we perturb the sentences in the unacceptable domain, we see decrease in MPP judgments in similar fashion.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wenn wir die Sätze im akzeptablen Bereich stören, sehen wir einen ähnlichen Anstieg aller Störungen, und wenn wir die Sätze im unakzeptablen Bereich stören, sehen wir einen ähnlichen Rückgang der MP-P-Judikate. So", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_319.wav", "doc_id": "dJGfOSFgZO.seg_319", "src_text": "We call this approach annotating behaviors in chat or ABC-Eval in short.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir nennen diese Herangehensweise Annotieren von Verhaltensweisen im Chat oder ABC für Kurzform.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_658.wav", "doc_id": "FLkGnzVRew.seg_658", "src_text": "Next, we determine the best method to update a model with new data from each round of active learning and annotations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Als nächstes bestimmen wir die beste Methode zur Aktualisierung eines Modells mit neuen Daten aus jeder Runde der aktiven Lern- und Annotierungen: kumulative Akkumulationen aller", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_499.wav", "doc_id": "SUkmfOTvGi.seg_499", "src_text": "Thank you so much.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vielen Dank.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_522.wav", "doc_id": "dvGkKzmIaN.seg_522", "src_text": "Before these main steps, we first select a trigger set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hauptschritte ausführen, wählen wir zunächst einen Auslöser. Der", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_51.wav", "doc_id": "TVCREhgqUP.seg_51", "src_text": "In the context of semantic parsing, testing for compositional generalization might look like this.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Im Kontext des semantischen Parsings, bei dem man für die generelle Zusammensetzung testet, sieht", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_686.wav", "doc_id": "oaOHnMCwad.seg_686", "src_text": "And as a researcher, positionality can influence the research process and its outcomes and results because it can change the decisions that researchers make.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "als Forscher kann die Positionalität den Forschungsprozess und seine Ergebnisse und Ergebnisse beeinflussen, weil sie die Entscheidungen, die Forscher treffen, verändern kann.", "score": 76.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_159.wav", "doc_id": "SLpqvupgvW.seg_159", "src_text": "Hi!", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_587.wav", "doc_id": "rISrKoXQCx.seg_587", "src_text": "I think that's pretty much all I have for today.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "ich glaube, das ist sehr viel, ich bin gestorben, ich bin fünfmal für heute gestorben,", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_559.wav", "doc_id": "rISrKoXQCx.seg_559", "src_text": "They occupy all four quadrants on the political campus.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Man kann", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_627.wav", "doc_id": "oeooqChmKK.seg_627", "src_text": "To summarize the main takeaways of our paper, many coreference resolution models appear unable to reason over knowledge from different sources without task-specific training.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Um die Hauptaspekte unseres Papiers zusammenzufassen: Viele Referenzmodelle scheinen nicht in der Lage zu sein, Wissen aus verschiedenen Quellen ohne taskspezifische Schulung zu verarbeiten.", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_71.wav", "doc_id": "TVCREhgqUP.seg_71", "src_text": "That's why in the second step we use another model to predict a permutation to put them into the right order.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Deshalb verwenden wir in der zweiten Phase ein anderes Modell, um eine Permutation vorherzusagen, um sie in die richtige Reihenfolge zu bringen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_661.wav", "doc_id": "FLkGnzVRew.seg_661", "src_text": "Next, to improve the number of dissonance examples, we use a Probability-of-Rare-Class strategy — PRC — to select mostly the examples that are highly likely to be descended by the current model at any round of rare.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Um die Anzahl der Beispiele zu erhöhen, wählen wir die Beispiele aus, die am ehesten durch das aktuelle Modell unterschieden werden können.", "score": 5.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_790.wav", "doc_id": "WTTtiRKFZI.seg_790", "src_text": "Right?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "weil hier", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_123.wav", "doc_id": "wLqFAuDnKa.seg_123", "src_text": "This is joint work with my colleagues from Google Translate.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Dies ist eine gemeinsame Arbeit mit meinen Kolleginnen und Kollegen von Google Translate.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_418.wav", "doc_id": "WBLMIsdIrq.seg_418", "src_text": "We then use the MuDA tagger, by applying the tagger on a parallel corpus that we want to use for evaluation and we apply our translation metrics of choice on the context-dependent examples that the MuDA tagger has identified.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Dann verwenden wir den Mudatagger, indem wir den Tagger auf einem parallelen Korpus anwenden, den wir für die Bewertung verwenden möchten, und wir wenden unsere Übersetzungsmaße der Wahl auf die kontextabhängigen Beispiele an, die der Mudatagger identifiziert hat.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_686.wav", "doc_id": "oaOHnMCwad.seg_686", "src_text": "And as a researcher, positionality can influence the research process and its outcomes and results because it can change the decisions that researchers make.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und als Forscher kann die Positionierung den Forschungsprozess und seine Ergebnisse beeinflussen, weil sie die Entscheidungen der Forscher ändern kann.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_356.wav", "doc_id": "gGbuDbHhyc.seg_356", "src_text": "Second, if clean data is required, or if clean data is mandatory for WSL to work, then how many clean samples do we need?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "saubere Daten erforderlich sind oder wenn saubere Daten für die Arbeit von WSL obligatorisch sind, wie viele saubere Beispiele benötigen wir dann?", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_158.wav", "doc_id": "wLqFAuDnKa.seg_158", "src_text": "Thank you very much.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "vielen Dank.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_137.wav", "doc_id": "wLqFAuDnKa.seg_137", "src_text": "So, it's important to select a good prompting strategy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "erreichen, also ist es wichtig, eine gute Promotionsstrategie auszuwählen.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_685.wav", "doc_id": "oaOHnMCwad.seg_685", "src_text": "This is a concept widely used in critical studies, specifically in feminist and queer academic spaces.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Dies ist ein weit verbreitetes Konzept in kritischen Studien, insbesondere in feministischen und queer akademischen Räumen,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_234.wav", "doc_id": "oYCKgTzTDy.seg_234", "src_text": "We also test Monolingual Few-shot setting by training monolingual models with only 10% of training data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Außerdem testen wir die Einstellungen für die Monolinguale, indem wir Monolinguale Modelle mit nur dreizehn Prozent der Trainingsdaten trainieren.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_377.wav", "doc_id": "gGbuDbHhyc.seg_377", "src_text": "Second, WSL approaches should be compared with few-shot learning baselines, as both work on clean samples.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zweitens sollten WSAL-Ansätze mit zukünftigen Lernbasen verglichen werden, eine vorgesehene Arbeit an klaren Mustern;", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_766.wav", "doc_id": "XejEJmgUmE.seg_766", "src_text": "That is, when we perturb the sentences in the acceptable domain, we see similar increase in all the perturbations and when we perturb the sentences in the unacceptable domain, we see decrease in MPP judgments in similar fashion.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "sind. Das heißt, wenn wir die Sätze in der akzeptablen Domäne stören, sehen wir einen ähnlichen Anstieg aller Störungen und wenn wir die Sätze in der nicht akzeptablen Domäne stören, sehen wir einen ähnlichen Rückgang der MP-Richtlinien. Der", "score": 83.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_156.wav", "doc_id": "wLqFAuDnKa.seg_156", "src_text": "And that's it for this really short overview.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Und das ist es, was wir für diese wirklich schockierende Überprüfung", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_477.wav", "doc_id": "SUkmfOTvGi.seg_477", "src_text": "Throughout experiments we found that there are three main ingredients that are needed.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "unsere Experimente haben wir festgestellt, dass es drei Hauptbestandteile gibt, die benötigt werden.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_381.wav", "doc_id": "gGbuDbHhyc.seg_381", "src_text": "Please feel free to check it out.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "finden. Bitte fühlen Sie sich frei, ihn", "score": 73.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_315.wav", "doc_id": "dJGfOSFgZO.seg_315", "src_text": "Therefore, you might want to evaluate multiple dimensions of chat quality to understand the strengths and weaknesses of the model on a finer-grained level.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "daher möchte ich, dass Sie mehrere Dimensionen der Dialogqualität bewerten, um die Stärken und Schwächen des Modells zu verstehen.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_400.wav", "doc_id": "WBLMIsdIrq.seg_400", "src_text": "In this work, we extend CXMI to Pointwise CXMI which can measure context usage at the sentence level or at the word level.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In dieser Arbeit erweitern wir CxMI zu point-wise CxMI, mit dem man Kontextnutzung am Satzlevel messen kann. Level, oder auf Wortebene:", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_743.wav", "doc_id": "XejEJmgUmE.seg_743", "src_text": "So for example, here we have chosen like a typical pair of grammaticality from the BLiMP data set from the Adjunct Island case.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Daher haben wir beispielsweise hier wie eine typische Paarung von Grammatikalität aus dem Datenbestand von der Adjuvant Island-Kase ausgewählt.", "score": 81.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_373.wav", "doc_id": "gGbuDbHhyc.seg_373", "src_text": "Their performance gain and practicality are heavily overestimated.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Ihr Leistungsgewinn und ihre Praktikabilität werden stark überschätzt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_855.wav", "doc_id": "GvEBWkLmuI.seg_855", "src_text": "So first we use a lexicon of stereotypes, and we find that the generated personas contain a lot more stereotypes than the human-written ones.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "die wir verwenden, und wir stellen fest, dass die Stereotypen, die wir verwenden, viel mehr Stereotypen enthalten als die Stereotypen, die", "score": 81.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_766.wav", "doc_id": "XejEJmgUmE.seg_766", "src_text": "That is, when we perturb the sentences in the acceptable domain, we see similar increase in all the perturbations and when we perturb the sentences in the unacceptable domain, we see decrease in MPP judgments in similar fashion.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "reagieren. Das bedeutet, wenn wir die Sätze in der akzeptablen Domäne stören, sehen wir eine ähnliche Zunahme der Störungen, und wenn wir die Sätze in der unakzeptablen Domäne stören, sehen wir eine ähnliche Abnahme der Urteile.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_481.wav", "doc_id": "SUkmfOTvGi.seg_481", "src_text": "We found that usually larger models lead to better generalization.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir haben festgestellt, dass größere Modelle zu einer besseren Generalisierung führen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_640.wav", "doc_id": "FLkGnzVRew.seg_640", "src_text": "So why does this matter?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Warum ist das wichtig?", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_694.wav", "doc_id": "oaOHnMCwad.seg_694", "src_text": "The first step is to re annotate data sets with diverse annotators.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Der erste Schritt besteht darin, Datensätze mit verschiedenen Annotatoren neu zu annotieren.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_850.wav", "doc_id": "GvEBWkLmuI.seg_850", "src_text": "So when people are describing a warrior who is a woman, they'll usually actually specify \"woman warrior\" and mark the term with \"woman\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "einen Krieger beschreibt, der normalerweise eine Frau ist, dann ist das normalerweise eine Frau. Und mehr noch, die dominierenden", "score": 25.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_106.wav", "doc_id": "uZBWfYjYnf.seg_106", "src_text": "A word is emitted if the attention is not concentrated, that is, its sum is below a certain threshold alpha towards the last lambda speech frames, meaning that the received information is enough stable.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Ein Wort wird ausgesprochen, wenn die Spannung nicht konzentriert ist, d. h. wenn ihr Wert unter einem bestimmten Schwellenwert Alpha liegt, was bedeutet, dass die erhaltenen Informationen nicht stabil sind.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_646.wav", "doc_id": "FLkGnzVRew.seg_646", "src_text": "We used dissonance-first approach, as seen in the flow chart here.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir verwendeten den ersten Ansatz zur Diskrepanz, wie er hier im Flussdiagramm dargestellt", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_791.wav", "doc_id": "WTTtiRKFZI.seg_791", "src_text": "Because here between the verb and the direct object is an adjunct: \"yesterday\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "zwischen dem Verb und dem direkten Objekt gestern Abend noch etwas hinzugekommen ist.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_834.wav", "doc_id": "GvEBWkLmuI.seg_834", "src_text": "To overcome these limitations, we rely on the property that these newer instruction-tuned LLMs are very good at responding to instructions and prompts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Diese Einschränkungen zu überschreiten, sind wir auf die Eigenschaft angewiesen, dass diese neuen Anweisungen sehr gut auf Anweisungen antworten. So kann", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_624.wav", "doc_id": "oeooqChmKK.seg_624", "src_text": "When trained on KITMUS, however, both C2F and BERT4Coref perform significantly better than the random choice.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wenn wir jedoch auf Kidmus ausgebildet werden, funktionieren sowohl Sea to Earth als auch Bert Forquerth deutlich besser als die Durandal-Option.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_263.wav", "doc_id": "PIZEXUFLAR.seg_263", "src_text": "Hello everyone, my name is Ying and my colleague Zhiyang and I will be presenting our research on MultiInstruct improving Multi-Modal Zero-Shot Learning via Instruction Tuning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo alle, mein Name ist Ying und mein Kollege Jian und ich werden unsere Forschung über Multi-Instruct, Verbesserung des multimodalen sozialen Lernens durch Instruktion, vorstellen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_625.wav", "doc_id": "oeooqChmKK.seg_625", "src_text": "This suggests that when trained on generic reference resolution data sets, most learn to exploit surface cues, which are not useful when testing on KITMUS where such queues have been removed.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Dies lässt darauf schließen, dass Mäuse, wenn sie auf allgemeine Quervergleichsdatensätze trainiert werden, Oberflächenmerkmale ausnutzen, die bei der Überprüfung in einem Käfig nicht nützlich sind.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_152.wav", "doc_id": "wLqFAuDnKa.seg_152", "src_text": "The insights that we gained from the human evaluation that we performed using the MQM framework said that the fluency of PaLM is comparable to state-of-the-art systems but the main difference comes from the accuracy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die Erkenntnisse, die wir aus der menschlichen Analyse gewinnen, und die wir mit dem MQR-Verfahren durchführen, ist, dass die Fließfähigkeit von Palmen mit dem Zustand der Kunstsysteme vergleichbar ist, aber der Hauptunterschied kommt aus der Genauigkeit. Insbesondere,", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_497.wav", "doc_id": "SUkmfOTvGi.seg_497", "src_text": "We hope our paper calls for more research on how to improve generalizations of the models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir hoffen, dass unsere Arbeit mehr Forschung auf den Weg bringt, wie man die Modelle verbessern kann.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_450.wav", "doc_id": "hgIDlKNiFM.seg_450", "src_text": "In total, we have seven models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Insgesamt haben wir sieben Modelle.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_266.wav", "doc_id": "PIZEXUFLAR.seg_266", "src_text": "However, most previous works on instruction tuning focused on improving the zero-shot performance on language only tasks, while computer vision and multi-modal tasks have been left out.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die meisten früheren Arbeiten zum Anpassen von Anweisungen konzentrierten sich jedoch auf die Verbesserung der sekundären Leistung bei Sprachaufgaben, wobei Computerseh- und Multimodaltasks ausgelassen wurden.", "score": 45.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_107.wav", "doc_id": "uZBWfYjYnf.seg_107", "src_text": "For example, if we receive a speech chunk containing \"I'm going to talk about...\" and our model predicts the translation in German, and we will look at the cross-attention weights, we'll see that the first two words points to the earliest received speech frames, while the last word points to the last received speech frames, as lambda speech frames.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zum Beispiel, wenn wir einen Textschnipsel erhalten, der „Ich werde darüber sprechen“ enthält, und unser Modell die Übersetzung ins Deutsche vorhersagt. Und wir werden auf die Kreuzbelastung achten. Wir werden sehen, dass die ersten beiden Wörter auf die frühesten empfangenen Sprachrahmen verweisen, während das letzte Wort auf die zuletzt empfangenen Sprachrahmen, die sogenannten „Lamda“-Sprachrahmen, verweist.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_349.wav", "doc_id": "gGbuDbHhyc.seg_349", "src_text": "In weakly supervised learning, training algorithms are proposed to robustly train neural networks under such label noise so that the trained models still generalize well.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "in WSL Weekly Superwise Learning ist", "score": 5.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_414.wav", "doc_id": "WBLMIsdIrq.seg_414", "src_text": "So now we use our findings from our analysis to design a benchmark for document-level translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "verwenden wir unsere Ergebnisse aus unseren Analysen, um einen Benchmark für die Dokumenten-Novelle zu entwerfen.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_685.wav", "doc_id": "oaOHnMCwad.seg_685", "src_text": "This is a concept widely used in critical studies, specifically in feminist and queer academic spaces.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "häufig verwendet wird, insbesondere in feministischen und queer akademischen Räumen.", "score": 3.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_84.wav", "doc_id": "TVCREhgqUP.seg_84", "src_text": "First of all, the alignment between input and output is not given in the training data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "vor. Zunächst einmal ist die Ausrichtung zwischen Input und Output in den Trainingsdaten nicht angegeben.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_617.wav", "doc_id": "oeooqChmKK.seg_617", "src_text": "Here's an example of how we control the availability of facts in the true sources.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hier ist ein Beispiel dafür, wie wir die Verfügbarkeit von Fakten aus wahren Quellen kontrollieren.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_113.wav", "doc_id": "uZBWfYjYnf.seg_113", "src_text": "But also we want that they are shifted on the left.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "diesem Plot ist. Aber wir wollen auch, dass sie nach links verschoben", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_791.wav", "doc_id": "WTTtiRKFZI.seg_791", "src_text": "Because here between the verb and the direct object is an adjunct: \"yesterday\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "zwischen dem Verb und dem Objekt ein Abstand ist. Und", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_780.wav", "doc_id": "WTTtiRKFZI.seg_780", "src_text": "The conjunction headed approach assumed in Prague dependency treebanks, where coordinate structures are headed by the conjunction.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Prag-Ansatz, dem Konjunktur-Ansatz, der in Prag-Abhängigkeitstrinzenen durchgeführt wird, wobei Koordinatursysteme von der Konjunktur durchgeführt werden.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_70.wav", "doc_id": "TVCREhgqUP.seg_70", "src_text": "After the first step, we have all the right tokens, but they're not ordered.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Nach dem ersten Schritt haben wir alle richtigen Tokens, aber sie sind nicht bestellt.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_483.wav", "doc_id": "SUkmfOTvGi.seg_483", "src_text": "Here we also found that more fine tuning examples, actually also leads to better generalization.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hier haben wir auch festgestellt, dass mehr Feintuning-Beispiele tatsächlich zu einer besseren Generalisierung führen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_442.wav", "doc_id": "hgIDlKNiFM.seg_442", "src_text": "So we ask ourselves a question about what is the most appropriate data sources for a wide range of usage and those crawled data are good substitution for clinical data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "was die geeignetsten Datenquellen für eine Vielzahl von Anwendungen sind, und diese Daten sind gute Ersatzdaten für klinische Daten.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_698.wav", "doc_id": "oaOHnMCwad.seg_698", "src_text": "Our frame is largely enabled through Lab in the Wild and online crowdsourcing platform for where HCI collaborator.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "weitgehend über Lab in the Wild, eine Online-Crowdsourcing-Plattform, nutzbar. In", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_467.wav", "doc_id": "SUkmfOTvGi.seg_467", "src_text": "We observe that models have been used in CoNLL-2003 to develop NER for almost 20 years and this naturally raises several problems.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir stellen fest, dass Modelle, die Carnell 2003 zur Entwicklung von NER verwendet haben, fast 20 Jahre lang verwendet wurden, und dies wirft natürlich viele Probleme auf:", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_162.wav", "doc_id": "SLpqvupgvW.seg_162", "src_text": "Our goal is to understand users’ language when they want to make a choice.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Unser Ziel ist es, die Sprache der Benutzer zu verstehen, wenn sie eine Wahl treffen möchten.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_27.wav", "doc_id": "aQpIWggfCo.seg_27", "src_text": "We only keep the script if the target goal scores the highest in the goal set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir behalten nur das Skript bei, wenn das Zielziel der höchsten Bewertung im Zielziel erreicht.", "score": 78.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_479.wav", "doc_id": "SUkmfOTvGi.seg_479", "src_text": "Through our experiments we found that the transformer models normally generalize better to new data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Durch unsere Experimente haben wir festgestellt, dass sich die Transformatormodelle normalerweise besser an neue Daten anpassen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_569.wav", "doc_id": "rISrKoXQCx.seg_569", "src_text": "So this indicates that language models can also pick up the polarisation in our society.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "bewegt, nach 2017, was darauf hindeutet, dass Sprachmodelle auch die Polarisierung in unserer Gesellschaft aufgreifen können.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_450.wav", "doc_id": "hgIDlKNiFM.seg_450", "src_text": "In total, we have seven models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "wir sieben Modelle.", "score": 31.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_811.wav", "doc_id": "WTTtiRKFZI.seg_811", "src_text": "So when the difference between the lengths of the two conjuncts grows, the shorter conjunct prefers to be the first one, stronger, right?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Längendifferenzs wächst, also wenn die Länge des Längendifferenzs wächst, bevorzugt der kürzere Konjunkt zuerst der stärkere ist, also", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_444.wav", "doc_id": "hgIDlKNiFM.seg_444", "src_text": "Afterwards, we ask ourselves how much data do we need to train a specialized model on French data?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wie viel Daten benötigen wir, um ein spezialisiertes Modell auf französischen Daten zu trainieren?", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_681.wav", "doc_id": "oaOHnMCwad.seg_681", "src_text": "Where prospective AP is really not as sensitive to offensive terms that are more common in Indian contexts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "die Perspektive der AP wirklich nicht auf offensichtliche Begriffe in indischen Kontexten gerichtet ist.", "score": 62.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_17.wav", "doc_id": "aQpIWggfCo.seg_17", "src_text": "Results in the figure show that the semantic completeness in generated scripts is acceptable but the faithfulness to the constraints cannot be guaranteed.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "in der Grafik zeigen, dass die semantische Vollständigkeit in generierten Skripten akzeptabel ist, aber die Treue zu den Einschränkungen kann nicht garantiert werden.", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_233.wav", "doc_id": "oYCKgTzTDy.seg_233", "src_text": "In this setting, the source language is the same as target language, for example German to German or English to English.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In dieser Einstellung ist die Quellensprache dieselbe wie die Zielsprache, zum Beispiel Deutsch zu Deutsch oder Englisch zu Englisch.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_854.wav", "doc_id": "GvEBWkLmuI.seg_854", "src_text": "Now for some results.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Für Ergebnisse, also", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_627.wav", "doc_id": "oeooqChmKK.seg_627", "src_text": "To summarize the main takeaways of our paper, many coreference resolution models appear unable to reason over knowledge from different sources without task-specific training.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Um die wichtigsten Schlussfolgerungen des Berichts zusammenzufassen, Viele Koherenzmodellierungsmodelle scheinen ohne task-spezifische Ausbildung nicht in der Lage zu sein, Überwissens aus verschiedenen Quellen zu begründen.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_206.wav", "doc_id": "SLpqvupgvW.seg_206", "src_text": "Results with T5 XL model are summarized below.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Die Ergebnisse mit dem großen Modell T5-XL werden unten zusammengefasst.", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_737.wav", "doc_id": "XejEJmgUmE.seg_737", "src_text": "The current MPP pipeline basically doesn't allow us to evaluate a model's acceptance towards longer sentences.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "für MPP-Modelle erlaubt uns eigentlich nicht, die Akzeptanz eines Modells gegenüber längeren Sätzen zu bewerten.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_879.wav", "doc_id": "GvEBWkLmuI.seg_879", "src_text": "Have a good time at ACL.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Zuhören, ich hatte eine gute Zeit.", "score": 24.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_544.wav", "doc_id": "dvGkKzmIaN.seg_544", "src_text": "Thank you.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir kommen,", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_648.wav", "doc_id": "FLkGnzVRew.seg_648", "src_text": "As can be seen here, dissonance was only found in 3.5% of the annotated pairs.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wie man hier sehen kann, war die Diskrepanz nur in fünf Prozent der annotierten Paare zu", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_125.wav", "doc_id": "wLqFAuDnKa.seg_125", "src_text": "It's trained on a large collection of text, comprising 780 billion tokens.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Es basiert auf einer großen Textkollektion, die 780 Milliarden Dokumente", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_485.wav", "doc_id": "SUkmfOTvGi.seg_485", "src_text": "The first one is adaptive overfitting, which is overfitting costs by reusing the same test set over and over again and this is usually manifested as the diminishing returns on a new test set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "die erste ist die adaptive Überanpassung, die durch die wiederholte Verwendung desselben Tests verursacht wird, und dies zeigt sich normalerweise, wenn die Abnahme auf dem neuen Test zurückkehrt.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_141.wav", "doc_id": "wLqFAuDnKa.seg_141", "src_text": "It's crucial for zero and one-shot prompting.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "ist entscheidend für Null- und eine Anregung, aber", "score": 9.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_204.wav", "doc_id": "SLpqvupgvW.seg_204", "src_text": "For example, \"the one without words\", \"not the one with the 12 year old boy\", or \"the fictional one\", or \"comes from Azerbaijan\", and so on.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "zum Beispiel der ohne Worte, nicht der mit dem zwölfjährigen Jungen, oder der fiktionale oder aus Aserbaidschan. Der Alternativenkorpus hat", "score": 68.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_272.wav", "doc_id": "PIZEXUFLAR.seg_272", "src_text": "Here we present MultiInstruct, the first multi-modal instruction tuning benchmark dataset that consists of 62 diverse multi-modal tasks covering 10 broad categories.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hier stellen wir MultiInstruct vor, das erste multimodale Benchmark-Datensatz, der aus 62 verschiedenen multimodalen Aufgaben besteht, die 10 Kategorien abdecken.", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_590.wav", "doc_id": "oeooqChmKK.seg_590", "src_text": "This work is a collaboration between McGill University, Mila, and Microsoft Research.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Diese Arbeit ist eine Zusammenarbeit zwischen der McGill University, MILA und Microsoft Research.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_153.wav", "doc_id": "wLqFAuDnKa.seg_153", "src_text": "So, in particular, the most common errors are omission errors.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "häufig sind Auslassungsfehler. Es scheint, dass", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_259.wav", "doc_id": "oYCKgTzTDy.seg_259", "src_text": "And our results show many interesting findings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "und unsere Ergebnisse zeigen viele interessante Ergebnisse,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_121.wav", "doc_id": "uZBWfYjYnf.seg_121", "src_text": "Thanks for your attention.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "vielen Dank für Ihre Aufmerksamkeit.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_462.wav", "doc_id": "hgIDlKNiFM.seg_462", "src_text": "So thank you for this presentation, and we are looking forward to exchange at the poster session in Toronto.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Dank für diese Präsentation, und wir freuen uns darauf, in der Post Office in Toronto zu handeln.", "score": 22.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_706.wav", "doc_id": "oaOHnMCwad.seg_706", "src_text": "Our study in the end amassed over 16,000 annotations from over 1000 annotators from 87 countries.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "und am Ende sammelten über 16.000 Annotierungen von über 1.000 Annotatoren aus 87", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_233.wav", "doc_id": "oYCKgTzTDy.seg_233", "src_text": "In this setting, the source language is the same as target language, for example German to German or English to English.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "In diesem Sinne ist die Quelle die gleiche wie die Ziel-Sprache, zum Beispiel Deutsch zu Deutsch oder Englisch zu Englisch.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_820.wav", "doc_id": "WTTtiRKFZI.seg_820", "src_text": "So we showed that by measuring length in characters, the first column, in syllables the middle column, and in words the right column.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "wir, dass wir, indem wir die Länge von Zeichen messen, dem ersten Wort in Sätzen, dem mittleren Wort in Sätzen und den Worten im Text, dem", "score": 25.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_6.wav", "doc_id": "aQpIWggfCo.seg_6", "src_text": "Planning for the goals with specific constraints, such as \"make a chocolate cake\", still remains under-studied.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Die Planung für Ziele mit spezifischen Zielen, die spezifischen Einschränkungen wie das Backen eines Schokoladenkuchens unterliegen, ist immer noch nicht ausreichend erforscht.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_524.wav", "doc_id": "dvGkKzmIaN.seg_524", "src_text": "We assume the provider can collect a general text corpus and count the word frequency with it.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir gehen davon aus, dass der Anbieter einen allgemeinen Textkorpus sammeln und die Wortfrequenz zählen kann. Bei der", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_571.wav", "doc_id": "rISrKoXQCx.seg_571", "src_text": "So we see that if we investigate the per category performance, that is to say if we separate the performance into different demographics or political leaning of news media we can see a pattern.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "wir die Leistungskategorie untersuchen, das bedeutet, wenn wir die Leistung aufteilen. Verschiedene Demografien oder politische Nachrichtenmedien können ein Muster erkennen,", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_727.wav", "doc_id": "oaOHnMCwad.seg_727", "src_text": "Thank you.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vielen Dank.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_792.wav", "doc_id": "WTTtiRKFZI.seg_792", "src_text": "However, this effect may be ameliorated when the direct object is very heavy and very long.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "das Objekt des Direkts ist ein sehr schweres und sehr langes Objekt, weil es dann", "score": 79.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_76.wav", "doc_id": "TVCREhgqUP.seg_76", "src_text": "For the first output position, we simply select one, as highlighted in red.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Für die erste Ausgangsposition wählen wir einfach eines, das rot markiert ist.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_711.wav", "doc_id": "oaOHnMCwad.seg_711", "src_text": "We find that Dynahate is also most aligned to English speaking countries.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "auch heraus, dass die Datenmodelle für die meisten englischsprachigen Länder am besten geeignet sind.", "score": 7.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_535.wav", "doc_id": "dvGkKzmIaN.seg_535", "src_text": "We compute the similarity difference between benign and backdoor data set which is defined as delta cosine and delta L2.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir berechnen den Ähnlichkeitsunterschied zwischen dem normalen und dem Hintertür-Datensatz, der als Delta-Kosinus und Delta-L-Zwei definiert ist.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_346.wav", "doc_id": "gGbuDbHhyc.seg_346", "src_text": "Instead, we label the data using weak labeling sources, such as simple heuristic rules, knowledge bases, or low-quality crowdsourcing, as illustrated in the figure on the right.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "schwache Beschriftungsquellen, wie einfache heuristische Regeln, Wissensbasen oder niedrigwertige Cloud-Quellen, wie es in der Abbildung rechts dargestellt ist.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_439.wav", "doc_id": "hgIDlKNiFM.seg_439", "src_text": "Since then, this model has been adapted to many other languages, like in French with CamemBERT, and also in domains like biomedical with PubMedBERT and BioBERT and on clinical with ClinicalBERT, but mostly in English.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Seitdem wurde dieses Modell auf viele andere Sprachen wie Französisch mit Camembert und andere Domänen wie Biomedizin mit Pametber und Biober übernommen, und auf klinisch mit klinisch übernommen, aber meistens auf Englisch.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_696.wav", "doc_id": "oaOHnMCwad.seg_696", "src_text": "And so we opt to re annotate data to get many annotates for instance and to get a rich set of demographic data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir entscheiden uns also dafür, die Daten zu reannotieren, um viele Anwender beispielsweise zu erhalten und einen reichen Satz an demographischen Daten zu erhalten.", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_98.wav", "doc_id": "uZBWfYjYnf.seg_98", "src_text": "And training and maintaining several models to reach different latency regimes.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und man trainiert und hält mehrere Modelle, um verschiedene Latenzregime zu ermitteln,", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_143.wav", "doc_id": "wLqFAuDnKa.seg_143", "src_text": "It's the examples that carry most of the weight.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "sind die Beispiele, die den größten", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_837.wav", "doc_id": "GvEBWkLmuI.seg_837", "src_text": "And we can immediately see that this is very generalizable to any demographic because we can just specify whatever identity marker that we want into this prompt.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und wir können sofort sehen, dass das sehr allgemein für jede Demografie ist, weil wir einfach jede Identität angeben können, die wir in diesem Prom haben wollen.", "score": 83.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_146.wav", "doc_id": "wLqFAuDnKa.seg_146", "src_text": "In particular, we compare the selecting prompts from the training data for the WMT evaluations on the dev data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "insbesondere, dass wir die Auswahlanregungen aus den Trainingsdaten der WMT-Evaluierungen oder den Testdaten vergleichen.", "score": 81.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_621.wav", "doc_id": "oeooqChmKK.seg_621", "src_text": "We evaluate the data set both with human study participants, and established coreference resolution models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir bewerten sowohl die Datensätze mit den menschlichen Studienteilnehmern als auch die etablierten Lösungsmodelle.", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_437.wav", "doc_id": "hgIDlKNiFM.seg_437", "src_text": "And finally, we conclude about the experiments and give you more details about how to access those models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und schließen schließlich über die Experimente ab und geben Ihnen mehr Details darüber, wie Sie die Modelle zugreifen können.", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_596.wav", "doc_id": "oeooqChmKK.seg_596", "src_text": "Therefore, successful models for knowledge-intensive NLU tasks require the ability to integrate and use both pretrain-time and inference-time knowledge.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "hat. Daher erfordern erfolgreiche Modelle für wissensintensive LU-Aufgaben die Fähigkeit, sowohl vorbereitete Zeit als auch Inferenzzeit zu nutzen.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_266.wav", "doc_id": "PIZEXUFLAR.seg_266", "src_text": "However, most previous works on instruction tuning focused on improving the zero-shot performance on language only tasks, while computer vision and multi-modal tasks have been left out.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Allerdings konzentrierten sich die meisten vorherigen Arbeiten zur Anweisungstuning auf die Verbesserung der Null-Schussleistung bei Sprachaufgaben, wobei Computer-Vision- und multimodale Aufgaben außer Acht gelassen wurden.", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_170.wav", "doc_id": "SLpqvupgvW.seg_170", "src_text": "Or when the user wants to specify a preference.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Oder wenn der Benutzer eine Präferenz angeben möchte,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_831.wav", "doc_id": "GvEBWkLmuI.seg_831", "src_text": "However, these measures have various limitations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Diese Maßnahmen haben jedoch verschiedene Einschränkungen, sie", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_600.wav", "doc_id": "oeooqChmKK.seg_600", "src_text": "Here is an example from our data set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "ein Beispiel aus unserem Datensatz:", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_282.wav", "doc_id": "PIZEXUFLAR.seg_282", "src_text": "We use all the instances in the test split for each task.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "verwenden alle Instanzen im Testsplit für", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_45.wav", "doc_id": "aQpIWggfCo.seg_45", "src_text": "Thanks for your time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Vielen Dank für Ihre Zeit.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_223.wav", "doc_id": "oYCKgTzTDy.seg_223", "src_text": "The Lambda calculus is missing, or they're only evaluated on certain neural models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Der Mond ist sichtbar. Oder sie werden nur anhand bestimmter neuerer Modelle bewertet.", "score": 14.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_196.wav", "doc_id": "SLpqvupgvW.seg_196", "src_text": "So what we do is that we show some background knowledge about the two entities.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir zeigen also einige Hintergrundwissen über die beiden Entitäten.", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_296.wav", "doc_id": "PIZEXUFLAR.seg_296", "src_text": "Here we can see, as the amount of task increases, the model achieves better performance and in the meantime, lower sensitivity.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hier können wir sehen, dass, wenn die Anzahl der Aufgaben zunimmt, das Modell eine bessere Leistung erreicht und gleichzeitig eine geringere Empfindlichkeit.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_306.wav", "doc_id": "PIZEXUFLAR.seg_306", "src_text": "So this is a QR code for our data and model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "ist dies ein QR-Code für unsere Daten und unser Modell.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_808.wav", "doc_id": "WTTtiRKFZI.seg_808", "src_text": "So what we did, we extracted various statistics about coordination from the enhanced version of the Penn Treebank and see the paper \"Why wouldn't you use universal dependencies\" and these statistics confirm the observation made many times before that left conjuncts tend to be shorter.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir haben verschiedene Statistiken über die Koordination aus der erweiterten Version der Pentribank und sehen uns das Papier an, warum wir keine universellen Abhängigkeiten verwenden würden. Und diese Statistiken bestätigen die Beobachtung, dass die linken Konjunktionen tendenziell kürzer sind", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_515.wav", "doc_id": "dvGkKzmIaN.seg_515", "src_text": "Finally, the watermark needs to be transferable to the attacker's services during the model extraction process.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Schließlich muss das Wasserzeichen während des Modellentnahmeverfahrens auf die Oberfläche des Angreifers übertragen werden.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_177.wav", "doc_id": "SLpqvupgvW.seg_177", "src_text": "In the first bubble, Bob says, \"Remember that song we were listening to yesterday?\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In der ersten Blase sagt Bob „Denk an das Lied, das wir gestern Abend gehört haben“.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_695.wav", "doc_id": "oaOHnMCwad.seg_695", "src_text": "And we ought to do this over looking at the demographics of original data sets annotators, because, usually only a few annotators annotate each instance and because demographics are rarely collected and shared.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und wir werden das in den Demografien der Originaldatensätze nachschlagen, Annotatoren, weil normalerweise nur wenige Annotatoren vorhanden sind und weil die Demografien tatsächlich gesammelt und geteilt werden.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_393.wav", "doc_id": "WBLMIsdIrq.seg_393", "src_text": "And some people have suggested targeted evaluation on context-dependent translations, but these resources only support limited types of context-dependent translations and limited sets of languages since they usually rely on domain knowledge and human curation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "einige Leute haben vorgeschlagen, kontextabhängige Übersetzungen auf kontextabhängige Übersetzungen zu bewerten, aber diese Ressourcen unterstützen nur begrenzte Arten von kontextabhängigen Übersetzungen und begrenzte Sprachmengen, da sie normalerweise auf Domänwissen und menschliche Kuration angewiesen sind.", "score": 86.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_412.wav", "doc_id": "WBLMIsdIrq.seg_412", "src_text": "And finally, we look at different individual tokens that have high P-CXMI.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und schließlich sehen wir uns unterschiedliche individuelle Token an, die eine hohe PSXMI haben,", "score": 61.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_80.wav", "doc_id": "TVCREhgqUP.seg_80", "src_text": "To give you a teaser of the experimental results, here we compare our method with other treeless models on the COGS benchmark.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Um Ihnen einen Überblick über die experimentellen Ergebnisse zu geben, vergleichen wir hier unsere Methode mit anderen Baumlos-Modellen auf der Grundlage des", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_367.wav", "doc_id": "gGbuDbHhyc.seg_367", "src_text": "As we can see, if we have 10 samples per class, direct fine-tuning starts to beat WSL approaches.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wie wir sehen können, beginnt Direct Fine-Tuning, wenn wir zehn Proben pro Klasse haben, WS-Ansätze zu schlagen.", "score": 78.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_792.wav", "doc_id": "WTTtiRKFZI.seg_792", "src_text": "However, this effect may be ameliorated when the direct object is very heavy and very long.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dieser Effekt kann jedoch verbessert werden, wenn das Zielobjekt sehr schwer und sehr lang ist,", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_118.wav", "doc_id": "uZBWfYjYnf.seg_118", "src_text": "And we also see that if we consider the actual elapsed time or the computational-aware time, that is the fastest strategy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und wir sehen auch, dass, wenn wir die tatsächliche Laufzeit oder die computergestützte Arbeitszeit betrachten, die FASTER-Strategie die schnellste ist.", "score": 57.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_388.wav", "doc_id": "WBLMIsdIrq.seg_388", "src_text": "Well, if the previous sentence was \"Things could start to get dangerous if the ministers find out\", then \"mole\" refers to a spy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wenn die vorherige Aussage lautete, dass die Dinge gefährlich werden könnten, wenn die Minister es herausfinden, bezieht sich „Moe“ auf einen Spion.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_510.wav", "doc_id": "dvGkKzmIaN.seg_510", "src_text": "To protect the copyright of embedding as services, one of the solutions is to embed a watermark in the provider service and detect whether another service contain the watermark.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Um die Urheberrechte von Embedding-Services zu schützen, kann eine der Lösungen darin bestehen, ein Wasserzeichen in den Diensten des Anbieters zu embedden und zu überprüfen, ob ein anderer Dienst das Wasserzeichen enthält.", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_456.wav", "doc_id": "hgIDlKNiFM.seg_456", "src_text": "Overall, from-scratch pre-training seems to obtain higher performance on most of the tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Insgesamt scheint das Scratch-Training eine höhere Leistung bei den meisten Aufgaben zu erzielen. Allerdings können unsere Experimente,", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_212.wav", "doc_id": "SLpqvupgvW.seg_212", "src_text": "We've also shown that the models are domain-generalizable.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir haben auch gezeigt, dass die Modelle domänenspezifisch sind, hier", "score": 1.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_75.wav", "doc_id": "TVCREhgqUP.seg_75", "src_text": "We go from left to right over the output and determine which multiset token to put in every position.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir gehen von links nach rechts über die Ausgabe und bestimmen, welchen Multisets-Token wir in jede Position setzen.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_59.wav", "doc_id": "TVCREhgqUP.seg_59", "src_text": "In particular, they often fail to reproduce the systematic correspondences between input and output, such as those that are color-coded in the example.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Insbesondere funktionieren die systematischen Korrespondenzen zwischen Input und Output nicht immer. Die", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_275.wav", "doc_id": "PIZEXUFLAR.seg_275", "src_text": "OFA uses a unified vocabulary for language, image tokens and the coordinates of a bounding box.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "OFA verwendet eine einheitliche Vokabularität für Sprache, Bildsymbolen und den Koordinator einer Bindungskiste.", "score": 47.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_292.wav", "doc_id": "PIZEXUFLAR.seg_292", "src_text": "So this measures the model's ability to consistently produce the same outputs for the same task regardless of the slight variation in the wording of the instruction.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "die Fähigkeit des Modells zu messen, die gleichen Ergebnisse für die gleiche Aufgabe zu produzieren, unabhängig von geringfügigen Abweichungen in der Formulierung der Anweisung.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_774.wav", "doc_id": "WTTtiRKFZI.seg_774", "src_text": "So in this case, Lisa.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "in diesem Fall Ilsa. Ähnliche Ansätze", "score": 74.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_197.wav", "doc_id": "SLpqvupgvW.seg_197", "src_text": "For songs, we simply show a Google search link to each song and then ask the annotators to listen to at least some of each song, and read about each song.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "zeigen. Für Songs zeigen wir einfach einen Google-Suchlink zu jedem Song. Und bitten Sie dann die Annotatoren, zumindest einige der Lieder anzuhören und darüber zu lesen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_676.wav", "doc_id": "oaOHnMCwad.seg_676", "src_text": "This work was done in collaboration with some folks at the University of Washington and the Allen Institute for AI, namely Sebastian Santy, Ronan Le Bras, Katharina Reinecke and Maarten Sap.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Diese Arbeit wurde in Zusammenarbeit mit einigen Kolleginnen und Kollegen der Universität Washington und des AI-Instituts der Universität Washington durchgeführt, darunter Sebastian Santy, Ronan Labras, Caterina Rinaea und Martin Sap.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_540.wav", "doc_id": "dvGkKzmIaN.seg_540", "src_text": "We also validate the covertness of the provided embedding by visualising the embedding of sentences on four dataset [INAUDIBLE 4:39] PCA.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir validierten auch die Verschlüsselung der bereitgestellten Einbettung, indem wir die Einbettung von Sätzen auf Virtuelle-Z-V-P-A validierten.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_421.wav", "doc_id": "WBLMIsdIrq.seg_421", "src_text": "But then if we use COMET, context-aware models perform best.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Aber wenn wir kommentierte, kontextsensitive Modelle", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_15.wav", "doc_id": "aQpIWggfCo.seg_15", "src_text": "We find that all language models achieve unsatisfactory results on planning for specific goals.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir stellen fest, dass alle Linearmodelle bei der Planung für bestimmte Ziele unzufriedenstellende Ergebnisse liefern.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_432.wav", "doc_id": "hgIDlKNiFM.seg_432", "src_text": "In this presentation, we first talk about language modeling in healthcare.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In diesem Vortrag sprechen wir zunächst über Sprachmodellierung im Gesundheitswesen, dann", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_333.wav", "doc_id": "dJGfOSFgZO.seg_333", "src_text": "You can see that in the results of our experiment that several challenges still remain and have been precisely quantified.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "den Ergebnissen unseres Experiments können Sie sehen, dass mehrere Herausforderungen noch bestehen und präzise quantifiziert wurden.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_543.wav", "doc_id": "dvGkKzmIaN.seg_543", "src_text": "That's all.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Das ist alles, danke.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_458.wav", "doc_id": "hgIDlKNiFM.seg_458", "src_text": "Which is not the case for the model based on CamemBERT weights and tokenizer, which suffer from stability issues.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Dies ist nicht der Fall für das Modell, das auf Kammanberwägen basiert und Stabilitätsprobleme aufweist.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_113.wav", "doc_id": "uZBWfYjYnf.seg_113", "src_text": "But also we want that they are shifted on the left.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "diesem Plot sind. Aber auch wir wollen, dass sie auf der linken Seite stehen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_348.wav", "doc_id": "gGbuDbHhyc.seg_348", "src_text": "If we directly train neural networks on weakly labeled data, the neural networks tend to memorize the label noise and do not generalize.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "mit menschlichen Notationen vergleicht, sind die schwachen Notationen immer noch ein allgemeiner Trend. In jüngsten Arbeiten", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_42.wav", "doc_id": "aQpIWggfCo.seg_42", "src_text": "We evaluate constrained language planning ability of large language models and develop an over-generate-then-filter method for large language models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "bewerten die eingeschränkte Sprachplanungsfähigkeit großer Sprachmodelle und entwickeln eine übergenerierende Filtermethode für große Sprachmodelle.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_152.wav", "doc_id": "wLqFAuDnKa.seg_152", "src_text": "The insights that we gained from the human evaluation that we performed using the MQM framework said that the fluency of PaLM is comparable to state-of-the-art systems but the main difference comes from the accuracy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die Erkenntnisse, die wir aus der menschlichen Analyse gewonnen haben, die wir mit dem MQM-Rahmenwerk durchgeführt haben, sind, dass die Flüssigkeit von Palm mit dem Zustand der Systeme vergleichbar ist, aber die Hauptunterschiede kommen von der Genauigkeit.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_570.wav", "doc_id": "rISrKoXQCx.seg_570", "src_text": "So last but not least, we evaluate language models with different political leanings on hate speech detection and fake news detection to NLP applications that often involve language models and could have very significant implications.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir bewerten Sprachmodelle mit unterschiedlichen politischen Ausrichtungen, Sprachprüfungen und Nachrichtenprüfungen, die Sprachmodelle beinhalten können und sehr signifikante Implikationen haben. Also sagen wir das, wenn wir", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_586.wav", "doc_id": "rISrKoXQCx.seg_586", "src_text": "Ok, great.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Okay, großartig,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_513.wav", "doc_id": "dvGkKzmIaN.seg_513", "src_text": "Second, the watermark should not degrade the utility of the provided embeddings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zweitens sollte das Watermark den Nutzen der bereitgestellten Embeddings nicht beeinträchtigen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_185.wav", "doc_id": "SLpqvupgvW.seg_185", "src_text": "We always use a simple template.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir verwenden immer eine einfache Vorlage:", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_835.wav", "doc_id": "GvEBWkLmuI.seg_835", "src_text": "So we can ask the model to generate a persona, which is a depiction of an imagined individual using a prompt like \"Imagine you are an Asian woman.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "das Modell eine Person erzeugen, die eine asiatische Frau beschreibt, die so aussieht, als ob sie sich selbst beschreiben würde.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_64.wav", "doc_id": "TVCREhgqUP.seg_64", "src_text": "Typically, this involves considerable formalism-specific pre-processing of the logical forms, for example, to handle variable symbols.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Typischerweise beinhaltet dies eine erhebliche formalismusspezifische Präprozessierung der logischen Formen, zum Beispiel zur Verarbeitung von variablen Symbolen. Auch spezielle Grammatikanalysen", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_573.wav", "doc_id": "rISrKoXQCx.seg_573", "src_text": "And vice versa, right-leaning language models are better at detecting hate speech targeting white and men, however worse at detecting hate speech targeting at black LGBTQ plus and other minority communities.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "treffen. Und umgekehrt: Modellierende Sprachmodelle sind besser darin, Heusprech zu erkennen, die auf Weiß und Mann zielen, aber es ist besser, Heusprech zu erkennen, die auf Schwarz, LGBQT+ und andere Minderheitsgemeinschaften zielen.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_103.wav", "doc_id": "uZBWfYjYnf.seg_103", "src_text": "And leverage the knowledge already acquired by the model through the attention mechanism between audio input and textual output.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und die Kenntnisse, die durch das Modell über den Spannungsmechanismus zwischen Audioeingabe und Text Ausgabe bereits erworben wurden,", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_528.wav", "doc_id": "dvGkKzmIaN.seg_528", "src_text": "The weight of the target embedding is proportional to the number of triggers in the sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Gewicht des Ziel-Embeddings ist proportional zur Anzahl der Auslöser in einem Satz.", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_83.wav", "doc_id": "TVCREhgqUP.seg_83", "src_text": "In our paper, we solve a couple of interesting technical challenges.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "In unserem Papier stellen wir einige interessante technische Herausforderungen", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_600.wav", "doc_id": "oeooqChmKK.seg_600", "src_text": "Here is an example from our data set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hier ist ein Beispiel aus unserem Datensatz:", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_544.wav", "doc_id": "dvGkKzmIaN.seg_544", "src_text": "Thank you.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dank.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_588.wav", "doc_id": "rISrKoXQCx.seg_588", "src_text": "Thank you for your time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "für deine Zeit.", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_748.wav", "doc_id": "XejEJmgUmE.seg_748", "src_text": "So that is what we call as the mismatch scenario.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Das ist das sogenannte Missmatch-Szenario.", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_257.wav", "doc_id": "oYCKgTzTDy.seg_257", "src_text": "To sum up, we build XSemPLR, a unified benchmark for cross-lingual semantic parsing with multiple natural languages and meaning representations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Zusammenfassend können wir sagen, dass wir ein Beispiel für eine einheitliche Referenz für die semantische Analyse mit mehreren natürlichen Sprachen und vielen Repräsentationen erstellen.", "score": 76.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_517.wav", "doc_id": "dvGkKzmIaN.seg_517", "src_text": "However, this method either not applicable to embedding as services or lack of transferability.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Diese Methoden sind jedoch entweder nicht anwendbar für die Einbettung von Adressdiensten oder es fehlt an der Übertragbarkeit.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_544.wav", "doc_id": "dvGkKzmIaN.seg_544", "src_text": "Thank you.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dank.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_197.wav", "doc_id": "SLpqvupgvW.seg_197", "src_text": "For songs, we simply show a Google search link to each song and then ask the annotators to listen to at least some of each song, and read about each song.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In einigen Fällen zeigen wir einfach einen Google-Suchlink zu jeder Liedes und bitten die Annotatoren, zumindest einige davon zu hören.", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_176.wav", "doc_id": "SLpqvupgvW.seg_176", "src_text": "The cartoon has three speech bubbles.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Der Cartoon hat drei Sprachblasen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_690.wav", "doc_id": "oaOHnMCwad.seg_690", "src_text": "However these works really don't look at comparing end users with the datasets and models themselves, and studying model and data set positionality is increasingly important as NLP tasks become more subjective and socially oriented, and it's challenging to characterise how these positionalities are skewed because not all decisions are documented and many models are hidden behind APIs.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Diese Arbeiten schauen jedoch nicht wirklich darauf, Endnutzer mit den Datensätzen und Modellen selbst zu vergleichen. Das Studieren des Modells und der Positionierbarkeit ist zunehmend wichtig, um mehr subjektiv und sozial orientiert zu sein. Es ist herausfordernd, diese Positionierungen zu beschreiben, da nicht alle Entscheidungen dokumentiert sind und viele Modelle hinter APIs versteckt sind.", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_58.wav", "doc_id": "TVCREhgqUP.seg_58", "src_text": "Naive seq2seq models struggle with this kind of out-of-distribution generalization and often produce outputs that are detached from the input.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Naive sequenz-zu-Sequenz-Modelle haben Schwierigkeiten mit dieser Art der Verallgemeinerung außerhalb der Verteilung und produzieren oft Ausgaben, die vom Eingabedaten getrennt sind.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_463.wav", "doc_id": "SUkmfOTvGi.seg_463", "src_text": "Hello everyone, my name is Shuheng.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hallo alle, mein Name ist Shuhung.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_483.wav", "doc_id": "SUkmfOTvGi.seg_483", "src_text": "Here we also found that more fine tuning examples, actually also leads to better generalization.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hier haben wir auch festgestellt, dass mehr Feinabstimmungsbeispiele tatsächlich auch zu einer besseren Verallgemeinerung führen.", "score": 6.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_396.wav", "doc_id": "WBLMIsdIrq.seg_396", "src_text": "And second, how well do models handle these cases?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "und zweitens, wie können die Modelle diese Fälle gut handhaben?", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_164.wav", "doc_id": "SLpqvupgvW.seg_164", "src_text": "\"Did you mean 'Easy on Me' or 'I Gotta Feeling'?\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Frage: Meinten Sie 'Easy on me' oder 'Ich habe ein Gefühl'?", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_183.wav", "doc_id": "SLpqvupgvW.seg_183", "src_text": "The first speech bubble is chosen from a few manual prompts per domain.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die erste Sprachblase wird aus ein paar manuellen Prompt pro Domäne ausgewählt.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_768.wav", "doc_id": "XejEJmgUmE.seg_768", "src_text": "And the MPP evaluation the way that we do it currently with short and single sentence input, may not fully capture the language models abstract knowledge throughout the context window.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und die MP-Beurteilung, die Art und Weise, wie wir es korrekt mit kurzen und einzelnen Satz-Eingaben durchführen, kann das abstrakte Wissen der Sprachmodelle im Kontextfenster möglicherweise", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_45.wav", "doc_id": "aQpIWggfCo.seg_45", "src_text": "Thanks for your time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Vielen Dank für Ihre Zeit.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_1.wav", "doc_id": "aQpIWggfCo.seg_1", "src_text": "I'm here to introduce our work \"Distilling Script Knowledge from Large Language Models for Constrained Language Planning\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Ich bin hier, um unsere Arbeit vorzustellen, die die Unterscheidung von Skriptkenntnissen von Light-Language-Modellen für eingeschränkte Sprachplanung beinhaltet.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_734.wav", "doc_id": "XejEJmgUmE.seg_734", "src_text": "Which can also include grammaticality like BLiMP, SyntaxGym, or acceptability in terms of stereotypes such as CrowS pairs.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Grammatikalität wie Blimp-Syntakt-Gem oder Akzeptabilität in Form von Stereotypen wie Cruass-Paare umfassen können.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_199.wav", "doc_id": "SLpqvupgvW.seg_199", "src_text": "For the recipes and books domain, we show some background text from Wikipedia.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Rezepte und Bücher zeigen wir etwas Hintergrundtext von Wikipedia.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_294.wav", "doc_id": "PIZEXUFLAR.seg_294", "src_text": "As we can see, instruction tuning can significantly improve OFA's performance on seen multi-modal tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wie wir sehen können, kann die Anweisungseinstellung die Leistung von O.S. auf Multimode-Aufgaben erheblich verbessern. Auch das", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_419.wav", "doc_id": "WBLMIsdIrq.seg_419", "src_text": "And finally, we use our benchmark as well as other metrics to evaluate different models on the document-level machine translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und schließlich verwenden wir unsere Benchmarks und andere Metriken, um verschiedene Modelle auf der Dokumentenebene zu bewerten.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_78.wav", "doc_id": "TVCREhgqUP.seg_78", "src_text": "We determine the third token in the output in a similar way by jumping to another multiset token.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir bestimmen den dritten Token in der Ausgabe auf ähnliche Weise, indem wir zu einem anderen Multiset-Token springen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_326.wav", "doc_id": "dJGfOSFgZO.seg_326", "src_text": "From our analysis of these evaluation results, we found that ABC-Eval behavior labels are overall more reliable than labels collected by existing methods, as measured by inter-annotator agreement on 100 doubly-labeled conversations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Aus den Ergebnissen dieser Auswertungen geht hervor, dass die A-B-E-Verhaltensetiketten insgesamt zuverlässiger sind als die Etiketten, die durch bestehende Methoden gesammelt wurden. Darüber hinaus sind A-B-C-E-Labels bei der Gesamtkonversationsqualität aussagekräftiger als Metriken, die von existierenden Methoden abgeleitet werden,", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_788.wav", "doc_id": "WTTtiRKFZI.seg_788", "src_text": "So in English, as you might know, direct objects prefer to be close to the verb, while adjuncts may be further away.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "ist es also in englischer Sprache, wie du vielleicht weißt, so, dass ein direktes Objekt bevorzugt wird, wenn es der Verfremdung unterworfen ist, während ein Adjunkt vielleicht weiter weg ist, richtig,", "score": 45.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_784.wav", "doc_id": "WTTtiRKFZI.seg_784", "src_text": "Here loves to all conjuncts separately: Lisa, Bart, and Maggie.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Konjunktionen separat liefern.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_431.wav", "doc_id": "hgIDlKNiFM.seg_431", "src_text": "Hi, I am Yanis Labrak and I will present you our works on \"DrBERT: A Robust Pre-trained Model in French for Biomedical and Clinical Domains.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo, ich bin Yan Slavac und ich werde Ihnen unsere Arbeiten über Dr. Bert, ein robustes Trainingsmodell in Französisch für biomedizinische und klinische Bereiche, vorstellen.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_229.wav", "doc_id": "oYCKgTzTDy.seg_229", "src_text": "The first one is Translate-Test.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "erste ist der Übersetzungs-Test:", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_879.wav", "doc_id": "GvEBWkLmuI.seg_879", "src_text": "Have a good time at ACL.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "mir zuhören, haben wir eine gute Zeit in Ägypten.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_84.wav", "doc_id": "TVCREhgqUP.seg_84", "src_text": "First of all, the alignment between input and output is not given in the training data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "gelöst. Zunächst ist die Ausrichtung zwischen Eingabe und Ausgabe in den Trainingsdaten nicht angegeben.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_718.wav", "doc_id": "oaOHnMCwad.seg_718", "src_text": "So we have a few recommendations for this.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "es eine Position in LED und LP gibt? Wir haben also einige Empfehlungen", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_148.wav", "doc_id": "wLqFAuDnKa.seg_148", "src_text": "And their results so a better performance when using the dev data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Leistung beim Einsatz der Daten ermöglichen.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_57.wav", "doc_id": "TVCREhgqUP.seg_57", "src_text": "In this example, the model has seen shallow recursion during training and is tested on an example with deeper recursion.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "In diesem Beispiel hat das Modell während des Trainings eine flache Rezursion gesehen und wurde auf einem Beispiel mit einer tiefen Rezursion getestet.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_259.wav", "doc_id": "oYCKgTzTDy.seg_259", "src_text": "And our results show many interesting findings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "und unsere Ergebnisse zeigen viele interessante Ergebnisse", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_64.wav", "doc_id": "TVCREhgqUP.seg_64", "src_text": "Typically, this involves considerable formalism-specific pre-processing of the logical forms, for example, to handle variable symbols.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Typischerweise beinhaltet dies eine vorläufige Verarbeitung der logischen Formen, um z. B. variable Symbole zu handhaben.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_360.wav", "doc_id": "gGbuDbHhyc.seg_360", "src_text": "Otherwise, there is a large performance drop.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Andernfalls gibt es einen großen Leistungsverlust,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_472.wav", "doc_id": "SUkmfOTvGi.seg_472", "src_text": "This is a data set that we collected from Reuters News from 2020, and then annotated them with the same CoNLL-2003 annotation guidelines.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "+ + +, das wir aus den Nachrichten von Reuters aus dem Jahr 2020 gesammelt und dann mit den gleichen Anmerkungshinweisen Carneal 2003 annotiert haben.", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_37.wav", "doc_id": "aQpIWggfCo.seg_37", "src_text": "This figure shows the constraint distribution of CoScript.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Diese Abbildung zeigt die eingeschränkte Verteilung von Coscript.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_129.wav", "doc_id": "wLqFAuDnKa.seg_129", "src_text": "This involves using the latest test sets to avoid an overlap of the test data with the training data of the language model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "um eine Überlagerung der Testdaten mit den Trainingsdaten der Sprachmodelle zu vermeiden.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_181.wav", "doc_id": "SLpqvupgvW.seg_181", "src_text": "And in the third speech bubble, Bob uses an indirect reference to select one of these entities, for example, \"the newer one.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Frage, und in der dritten Sprechblase wählt Bob eine direkte Referenz aus, um beispielsweise den Neueren auszuwählen. Wir stellen die ersten", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_121.wav", "doc_id": "uZBWfYjYnf.seg_121", "src_text": "Thanks for your attention.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vielen Dank für Ihre Aufmerksamkeit.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_708.wav", "doc_id": "oaOHnMCwad.seg_708", "src_text": "We find that there is positionality in NLP.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir finden, dass es Positionalität in NLP gibt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_753.wav", "doc_id": "XejEJmgUmE.seg_753", "src_text": "So how does the model do?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wie funktioniert das Modell?", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_0.wav", "doc_id": "aQpIWggfCo.seg_0", "src_text": "Hi, I'm Siyu Yuan from Fudan University.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo, ich heiße Si Yuyan und komme von der Fudan-Universität.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_655.wav", "doc_id": "FLkGnzVRew.seg_655", "src_text": "We find that on transferring the zero-shot performance on the annotated data set is already much better than chance with the best, with AUC .62.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir finden, dass die Übertragung der Null-Schnitt-Leistung auf den annotierten Datensatz bereits viel besser ist als mit dem besten", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_760.wav", "doc_id": "XejEJmgUmE.seg_760", "src_text": "But when we match the structure, that is when we choose the sentences from the same phenomena in BLiMP or SyntaxGym, we see a massive increase or a massive decrease of the MPP judgement for the model, depending on whether the chosen prefix is acceptable or unacceptable.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Aber wenn wir die Struktur wählen, das heißt, wenn wir die Sätze aus demselben Phänomen im Blame-Per-Satz-Grammatik wählen, dann ist das die richtige Struktur. Wir sehen einen massiven Anstieg oder eine massive Abnahme der Einschätzung des Modells durch das Parlament, abhängig davon, ob der gewählte Präfix akzeptabel oder nicht akzeptabel ist.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_151.wav", "doc_id": "wLqFAuDnKa.seg_151", "src_text": "In our case, we chose to evaluate with Google Translate.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "nahe einem kommerziellen System, weshalb wir es mit Google Translate betreiben.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_273.wav", "doc_id": "PIZEXUFLAR.seg_273", "src_text": "These tasks are derived from 21 existing open-source dataset and each task is equipped with five expert written instructions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Diese Aufgaben sind von einundzwanzig vorhandenen Open-Source-Datensätzen abgeleitet, und jede Aufgabe ist mit fünf zusätzlichen Anweisungen ausgestattet.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_32.wav", "doc_id": "aQpIWggfCo.seg_32", "src_text": "However, previous studies do not enable planning for specific goals and manual dataset annotation is expensive.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "dorthin. Vorherige Studien ermöglichen jedoch keine Planung für spezifische Ziele, und die manuelle Datensatzannotation ist aufwendig.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_643.wav", "doc_id": "FLkGnzVRew.seg_643", "src_text": "Studying dissonance expressed in language can also be beneficial in understanding extremism and polarization of vulnerable groups.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die Untersuchung der Abstände in der Sprache kann auch für das Verständnis von Extremismus und Polarisierung von Gruppen sinnvoll sein.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_67.wav", "doc_id": "TVCREhgqUP.seg_67", "src_text": "For the first time, we show strong generalization to deeper recursion without relying on trees.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zum ersten Mal sehen wir eine starke Verallgemeinerung, um die Rekonstruktion durchzuführen, ohne auf Tricks zurückzugreifen.", "score": 35.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_703.wav", "doc_id": "oaOHnMCwad.seg_703", "src_text": "We've then compared these, annotations with Social Chemistry, Delphi and GPT 4.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir verglichen diese Anmerkungen dann mit Social Chemistry, Delphi und GPT-4. Wir werden", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_219.wav", "doc_id": "oYCKgTzTDy.seg_219", "src_text": "As shown in this figure, we need to translate the query in multiple natural languages using neural models to SQL, Lambda or FunQL, and etcetera.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wie in diesem Bild gezeigt, müssen wir die Anfrage in mehrere natürliche Sprachen übersetzen, indem wir neuere Modelle verwenden: 2, Ceql, Lmda oder FQL und", "score": 49.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_650.wav", "doc_id": "FLkGnzVRew.seg_650", "src_text": "To no surprise, the classifier performed not much better than chance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "von Unterschieden, um keine Überraschung zu erzeugen, dass die Klassifikation nicht viel besser ist als die Chance.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_805.wav", "doc_id": "WTTtiRKFZI.seg_805", "src_text": "Right?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Also, was", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_768.wav", "doc_id": "XejEJmgUmE.seg_768", "src_text": "And the MPP evaluation the way that we do it currently with short and single sentence input, may not fully capture the language models abstract knowledge throughout the context window.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "sind, und dass die MPPE-Bewertung, die wir derzeit mit kurzen und einzelnen Sätzen als Eingabe durchführen, möglicherweise nicht vollständig die abstrakte Sprachmodelle-Kenntnisse im Kontextfenster erfassen.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_233.wav", "doc_id": "oYCKgTzTDy.seg_233", "src_text": "In this setting, the source language is the same as target language, for example German to German or English to English.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "In dieser Konstellation ist die Quellsprache dieselbe wie die Zielsprache, zum Beispiel Deutsch zu Deutsch oder Englisch zu Englisch.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_372.wav", "doc_id": "gGbuDbHhyc.seg_372", "src_text": "To summarize, we showed that recent WSL approaches require clean, manually annotated samples for them to work properly.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zusammengefasst zeigen wir, dass aktuelle WSL-Ansätze saubere, manuell annotierte Proben erfordern, damit sie ordnungsgemäß funktionieren.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_284.wav", "doc_id": "PIZEXUFLAR.seg_284", "src_text": "So we use pre-trained OFA large model as a base model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Daher verwenden wir ein vorbereitetes OFA-Modell als Basismodell;", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_722.wav", "doc_id": "oaOHnMCwad.seg_722", "src_text": "And a good example of this is the Masakhani initiative.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und ein gutes Beispiel dafür", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_123.wav", "doc_id": "wLqFAuDnKa.seg_123", "src_text": "This is joint work with my colleagues from Google Translate.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Dies ist eine Zusammenarbeit mit meinen Kollegen von Google Translate.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_142.wav", "doc_id": "wLqFAuDnKa.seg_142", "src_text": "And when we go, as in our case, to five-shot prompting, there is nearly no difference to the actual form of the prompting.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Schuss, und wenn wir, wie in unserem Fall, zum Fächerschießen gehen, gibt es kaum einen Unterschied zur tatsächlichen Form des Schießens. Es", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_421.wav", "doc_id": "WBLMIsdIrq.seg_421", "src_text": "But then if we use COMET, context-aware models perform best.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "die beste Leistung aufweisen. Aber", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_163.wav", "doc_id": "SLpqvupgvW.seg_163", "src_text": "Consider this alternative question.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Betrachten Sie diese alternative", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_626.wav", "doc_id": "oeooqChmKK.seg_626", "src_text": "Additional experiments with fictional knowledge indicated even the best performing models, cannot reliably integrate backward knowledge provided only at inference time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zusätzliche Experimente mit fiktiverem Wissen zeigen, dass selbst die besten Leistungsmodelle nicht zuverlässig Hintergrundwissen, das nur zur Zeit der Erinnerung angeboten wird, integrieren können.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_595.wav", "doc_id": "oeooqChmKK.seg_595", "src_text": "Pretrained parameters can contain information about what presidents do and what a TV is but they cannot reliably know who this instance-specific entity \"John\" is, or who the new president is, because the president might have changed since pretraining.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "können Informationen über das, was Präsidenten tun und was sie sind, enthalten, aber sie können nicht zuverlässig wissen, wer diese spezifische Einheit ist oder wer der neue Präsident ist, weil der Präsident sich vielleicht während des Prä-Trainings verändert", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_777.wav", "doc_id": "WTTtiRKFZI.seg_777", "src_text": "Right.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "beiden", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_42.wav", "doc_id": "aQpIWggfCo.seg_42", "src_text": "We evaluate constrained language planning ability of large language models and develop an over-generate-then-filter method for large language models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "wir bewerten die konstrizierten Sprachplanungsfähigkeiten von großen Sprachmodellen und entwickeln ein übergenerierendes Filterverfahren für große Sprachmodelle.", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_100.wav", "doc_id": "uZBWfYjYnf.seg_100", "src_text": "So what is our solution?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "ist unsere Lösung?", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_644.wav", "doc_id": "FLkGnzVRew.seg_644", "src_text": "Finally, cognitive dissonance is important to understand personal cognitive styles of individuals and helps us understand decision making processes better.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Schließlich ist es wichtig, kontextuelle Unterschiede zu verstehen, um persönliche kontextuelle Stile von Einzelpersonen zu verstehen und uns zu helfen, Entscheidungsprozesse besser zu verstehen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_128.wav", "doc_id": "wLqFAuDnKa.seg_128", "src_text": "We evaluated the transition capability of such models using the best practices of the MT community.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir bewerten die Übersetzbarkeit von Modellen, indem wir die besten Übersetzungen der Gemeinschaft verwenden,", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_182.wav", "doc_id": "SLpqvupgvW.seg_182", "src_text": "We provide the first and second speech bubbles automatically, but the third one is filled in by the annotator.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir liefern die ersten und zweiten Sprechblasen automatisch, aber die dritte wird vom Annotator eingegeben.", "score": 81.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_741.wav", "doc_id": "XejEJmgUmE.seg_741", "src_text": "So that is the approach.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Also ist das der Ansatz,", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_806.wav", "doc_id": "WTTtiRKFZI.seg_806", "src_text": "It violates one principle, but it satisfies another one.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Es verletzt ein Prinzip, aber es erfüllt ein anderes.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_302.wav", "doc_id": "PIZEXUFLAR.seg_302", "src_text": "We also can see transfer learning from natural instruction datasets can help OFA to attain much better performance on the natural instruct dataset.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir können auch sehen, dass das Transfer-Lernen aus dem Datensatz der natürlichen Anweisung die Leistung von OFA auf dem Datensatz der natürlichen Anweisung verbessern kann.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_29.wav", "doc_id": "aQpIWggfCo.seg_29", "src_text": "Our method greatly improves the planning ability both in semantic completeness and faithfulness to the constraint.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Unsere Methode verbessert die Planbarkeit erheblich sowohl in Bezug auf semantische Vollständigkeit als auch in Bezug auf Treue zu den Einschränkungen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_712.wav", "doc_id": "oaOHnMCwad.seg_712", "src_text": "We also find most additional alignment with people who have a college education.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir finden auch heraus, dass die meisten zusätzlichen Übereinstimmungen mit Personen mit einem College-Abschluss bestehen,", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_46.wav", "doc_id": "aQpIWggfCo.seg_46", "src_text": "Please find more details of CoScript in our paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Bitte finden Sie weitere Einzelheiten zu Co-Script in unseren Unterlagen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_31.wav", "doc_id": "aQpIWggfCo.seg_31", "src_text": "Creating the dataset is an essential step to this end.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Das Erstellen eines Datensatzes ist ein wesentlicher Schritt auf dem Weg", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_782.wav", "doc_id": "WTTtiRKFZI.seg_782", "src_text": "And finally, there's also a multi-headed approach that's used, for example, in the Hudson's Word Grammar, where they say all conjuncts are heads of the coordinate structure.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Und schließlich ist dies auch ein mehrfach überarbeiteter Ansatz, der beispielsweise in der Katschons-Word-grammatik verwendet wird. Wo, so zu sagen, alle Konjungate über die Koordinatenstruktur stehen,", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_776.wav", "doc_id": "WTTtiRKFZI.seg_776", "src_text": "So these two approaches are asymmetric.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Ansätze gleich sind. Nun", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_268.wav", "doc_id": "PIZEXUFLAR.seg_268", "src_text": "Additionally, at the time of our research, we discovered a considerable discrepancy in the availability of instructional datasets between NLP and multi-modal.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Darüber hinaus entdeckten wir in der Zeit unserer Forschung eine erhebliche Diskrepanz in der Verfügbarkeit von Anweisungsdatensätzen zwischen L und M.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_591.wav", "doc_id": "oeooqChmKK.seg_591", "src_text": "Natural language understanding models draw on a variety of knowledge sources, such as knowledge contained in their parameters, usually acquired by a pretraining, and knowledge given in inputs at inference time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Modelle zur Verständigung von nationalen Sprachen basieren auf einer Vielzahl von Wissensquellen, wie z. B. Wissen, das in ihren Parametern enthalten ist, das normalerweise durch eine Vorbereitung erworben wird, und Wissen, das in Eingaben bei der Zeit der", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_123.wav", "doc_id": "wLqFAuDnKa.seg_123", "src_text": "This is joint work with my colleagues from Google Translate.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dies ist eine gemeinsame Arbeit mit meinen Kollegen von Google Translate.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_467.wav", "doc_id": "SUkmfOTvGi.seg_467", "src_text": "We observe that models have been used in CoNLL-2003 to develop NER for almost 20 years and this naturally raises several problems.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir haben festgestellt, dass Modelle seit fast zwanzig Jahren für die Entwicklung von Neuronen in Corel Draw verwendet werden, und dies wirft natürlich einige Probleme auf.", "score": 3.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_506.wav", "doc_id": "dvGkKzmIaN.seg_506", "src_text": "Embedding as services is one of the services built upon large language models to assist various, NLP tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "ist einer der Dienste, die auf großen Sprachmodellen basieren, um verschiedene NLP-Aufgaben zu unterstützen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_823.wav", "doc_id": "WTTtiRKFZI.seg_823", "src_text": "But when the governor is on the right this tendency disappears.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "auf der rechten Seite beobachtet.", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_565.wav", "doc_id": "rISrKoXQCx.seg_565", "src_text": "And we also try to investigate whether language models can pick up the polarisation that's prevalent in our modern society.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und wir werden auch versuchen, die Polarisierung, die in unserer modernen Gesellschaft vorherrscht, zu untersuchen.", "score": 1.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_251.wav", "doc_id": "oYCKgTzTDy.seg_251", "src_text": "The orange line is Cross-lingual Zero-shot transfer.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "orangefarbene Linie ist der Cross-Lingual-Zero-Shot-Transfer, während die", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_61.wav", "doc_id": "TVCREhgqUP.seg_61", "src_text": "The trees are intended to capture the compositional process that relates utterances with the logical forms.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Die Bäume sollen das kompositionelle Verfahren erfassen, das sich mit den logischen Formen in Verbindung setzt.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_822.wav", "doc_id": "WTTtiRKFZI.seg_822", "src_text": "What we see here is that when the governor is on the left, the tendency for the left conjunct to be shorter grows steadily, with the absolute difference in words, and the same is observed when there is no governor as in coordination of sentences.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Was wir hier sehen, ist, dass, wenn die Regierung Die Tendenz, dass die linke Konjunktion kürzer ist, wächst stetig mit der absoluten Differenz in Wörtern und dasselbe wird beobachtet, wenn es keine Gouverneur gibt, aber wenn der Gouverneur", "score": 56.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_404.wav", "doc_id": "WBLMIsdIrq.seg_404", "src_text": "We perform our analysis at three different levels.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir führen unsere Analysen auf drei verschiedenen Ebenen durch.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_190.wav", "doc_id": "SLpqvupgvW.seg_190", "src_text": "The first one is uniform at random.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die erste ist Uniform-Attrack", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_4.wav", "doc_id": "aQpIWggfCo.seg_4", "src_text": "And show that large language models can effectively decompose goals into steps.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und gezeigt, dass große Sprachmodelle Ziele effektiv in Schritte zerlegen können.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_569.wav", "doc_id": "rISrKoXQCx.seg_569", "src_text": "So this indicates that language models can also pick up the polarisation in our society.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "im Zentrum befinden, so dass die Sprachmodelle auch die Polarisierung in unserer Gesellschaft abbilden können.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_829.wav", "doc_id": "GvEBWkLmuI.seg_829", "src_text": "This work is done in collaboration with Esin Durmus and Dan Jurafsky.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "zu messen. Diese Arbeit wird in Zusammenarbeit mit Esnader Mush und Danarovsky durchgeführt.", "score": 68.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_873.wav", "doc_id": "GvEBWkLmuI.seg_873", "src_text": "So based on these patterns, we conclude with three recommendations for model owners.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Auf der Grundlage dieser Muster können wir drei Empfehlungen für Modelleigentümer zusammenfassen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_72.wav", "doc_id": "TVCREhgqUP.seg_72", "src_text": "We introduce a new method to predict the permutation that does not put any hard constraints on the possible permutations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir führen eine neue Methode zur Vorhersage der Permutation ein, die keine harten Einschränkungen für die möglichen Permutationen", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_438.wav", "doc_id": "hgIDlKNiFM.seg_438", "src_text": "Since its release in 2018, BERT has become one of the most effective approach to solve natural language processing tasks and offers huge performance gains compared to historical static and contextualized methods such as Word2vec, fastText, or more.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Seit seiner Veröffentlichung im Jahr 2018 ist Bert ein effektiver Ansatz zur Lösung natürlicher Sprachverarbeitungsaufgaben und bietet im Vergleich zu historischen statischen und kontextualisierten Methoden wie Word-to-Vec, FastText oder ANW einen größeren Leistungsgewinn.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_256.wav", "doc_id": "oYCKgTzTDy.seg_256", "src_text": "Pretraining on English natural language can significantly boost the performance of Few-shot on target natural languages, and we found multilingual language models such as Codex and BLOOM are still inadequate for cross-lingual semantic parsing tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Das Training auf Englisch kann die Leistung von Few-Shot auf Ziel-Natursprachen erheblich verbessern und wir fanden heraus, dass multilinguale Sprachmodelle wie Coders und Blue immer noch unzureichend für die Überprüfung von Semantik in mehreren Sprachen sind.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_784.wav", "doc_id": "WTTtiRKFZI.seg_784", "src_text": "Here loves to all conjuncts separately: Lisa, Bart, and Maggie.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Gouverneur, hier liebt, zu allen Konjunktionen separat, diese sind aber nicht relevant.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_317.wav", "doc_id": "dJGfOSFgZO.seg_317", "src_text": "However, we believe there is a more precise and reliable strategy for dimensional dialogue evaluation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir glauben jedoch, dass es eine genauer und zuverlässigere Strategie für die dimensionale Dialogbewertung gibt.", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_671.wav", "doc_id": "FLkGnzVRew.seg_671", "src_text": "These are the links to our core data set and our paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Diese sind die Links zu unserem Code-Datensatz und unserem Papier.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_535.wav", "doc_id": "dvGkKzmIaN.seg_535", "src_text": "We compute the similarity difference between benign and backdoor data set which is defined as delta cosine and delta L2.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir berechnen den Ähnlichkeitsunterschied zwischen dem Basis-Embedding und dem Backdoor-Embedding, das als Delta-Embedding definiert ist.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_131.wav", "doc_id": "wLqFAuDnKa.seg_131", "src_text": "We use state-of-the-art, neural MT metrics, and additionally also show expert-based human evaluation results.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir verwenden state-of-the-art-neuronale MT-Metriken und zeigen zusätzlich Ergebnisse der Expertenbewertung durch Menschen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_282.wav", "doc_id": "PIZEXUFLAR.seg_282", "src_text": "We use all the instances in the test split for each task.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir verwenden alle Instanzen im Test für", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_668.wav", "doc_id": "FLkGnzVRew.seg_668", "src_text": "However, the annotators also find the examples difficult.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die Anmerkungen sind jedoch schwierig. In der Zusammenfassung", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_830.wav", "doc_id": "GvEBWkLmuI.seg_830", "src_text": "In recent years, many have documented the prevalence of social bias and stereotypes in large language models, or LLMs.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "In jüngster Zeit haben viele die Prävalenz von sozialen Vorurteilen und Stereotypen in großen Sprachmodellen dokumentiert.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_275.wav", "doc_id": "PIZEXUFLAR.seg_275", "src_text": "OFA uses a unified vocabulary for language, image tokens and the coordinates of a bounding box.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "verwendet ein einheitliches Vokabular für Sprache, Bildsymbole und Koordinaten von Begrenzungskästen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_49.wav", "doc_id": "TVCREhgqUP.seg_49", "src_text": "This is joint work with my advisors Alexander Koller and Ivan Titov.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dies ist eine gemeinsame Arbeit mit meinen Beratern Alexander Koller und Ivan Titov.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_669.wav", "doc_id": "FLkGnzVRew.seg_669", "src_text": "In summary, we find that PRC is a simple AL strategy for rare class acquisition and cold starting AL with appropriately designed transfer learning task and help significantly.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zusammenfassend finden wir, dass PRC eine einfache AL-Strategie für die Akquisition von Raritätsklassen ist und die Ko-Start-AL mit entsprechend konzipierten Transfer-Lernaufgaben erheblich unterstützen kann.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_6.wav", "doc_id": "aQpIWggfCo.seg_6", "src_text": "Planning for the goals with specific constraints, such as \"make a chocolate cake\", still remains under-studied.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die Planung von Zielen mit spezifischen Einschränkungen, wie z. B. Make a Chocolate Cake, ist noch nicht untersucht.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_322.wav", "doc_id": "dJGfOSFgZO.seg_322", "src_text": "For example, ABC-Eval measures the number of turns in which a chat model ignores its partner or says something irrelevant, contradicts itself or its partner, hallucinates incorrect facts or violates common sense knowledge, and when the model succeeds or fails to show empathy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "A. C. E. die Anzahl der Umdrehungen, die ein Chat-Modell ignoriert oder relevant ist. Es widerspricht sich selbst oder seinem Partner, halluziniert unkorrekte Fakten oder verletzt das allgemeine Menschenverständnis, und wenn das Modell erfolgreich ist oder nicht, zeigt es Empathie.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_493.wav", "doc_id": "SUkmfOTvGi.seg_493", "src_text": "And these goes hand in hand, we can't just have one ingredient but throw out the others.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Beispiele benötigen würden. Zur gleichen Zeit stellten", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_16.wav", "doc_id": "aQpIWggfCo.seg_16", "src_text": "Then we conduct detailed analysis to investigate why learning models fail.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "führen wir detaillierte Analysen durch, um zu untersuchen, was die linearen Modelle für die Ergebnisse verantwortlich sind. Die Ergebnisse", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_75.wav", "doc_id": "TVCREhgqUP.seg_75", "src_text": "We go from left to right over the output and determine which multiset token to put in every position.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir gehen von links nach rechts über die Ausgabe und bestimmen, welcher Multiset-Token in jede Position gesetzt werden soll.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_810.wav", "doc_id": "WTTtiRKFZI.seg_810", "src_text": "And, also the observation that was made in parsing that this tendency grows with length difference.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und auch die Beobachtung, dass das Wachstum der Tendenz mit Längenunterschieden einherging. Wenn", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_751.wav", "doc_id": "XejEJmgUmE.seg_751", "src_text": "Finally, we can choose sentences from a completely unrelated domain such as Wikipedia.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Schließlich können wir Sätze aus einem vollkommen unabhängigen Bereich wie Wikipedia auswählen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_244.wav", "doc_id": "oYCKgTzTDy.seg_244", "src_text": "We found that Encoder-Decoder obtains the best performance on all nine datasets.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir fanden heraus, dass Encoder-Decoder-Modelle bessere Ergebnisse erzielen als monolinguale Modelle. Wir bewerten", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_603.wav", "doc_id": "oeooqChmKK.seg_603", "src_text": "Servin and Kea met at a park.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Bäckerin. Serwin und Kiah trafen sich nach einem", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_429.wav", "doc_id": "WBLMIsdIrq.seg_429", "src_text": "Thank you so much for your attention.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Vielen Dank für die Unterstützung.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_628.wav", "doc_id": "oeooqChmKK.seg_628", "src_text": "However, with task-specific training, some models successfully integrate knowledge from multiple sources.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Mit spezifischem Training können einige Modelle jedoch erfolgreich Wissen aus mehreren Quellen integrieren.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_745.wav", "doc_id": "XejEJmgUmE.seg_745", "src_text": "We extract grammatical sentences from Adjunct Island and then we add it as a prefix to both the acceptable query and the unacceptable query.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir extrahieren grammatikalische Sätze aus dem Adjektiv. Und dann fügen wir es als Präfix sowohl zur akzeptablen als auch zur inakzeptablen Frage hinzu.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_45.wav", "doc_id": "aQpIWggfCo.seg_45", "src_text": "Thanks for your time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Vielen Dank für Ihre Zeit.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_703.wav", "doc_id": "oaOHnMCwad.seg_703", "src_text": "We've then compared these, annotations with Social Chemistry, Delphi and GPT 4.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dann verglichen wir diese Anmerkungen mit Social Chemistry, Delphy und GPD Four. Dann", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_580.wav", "doc_id": "rISrKoXQCx.seg_580", "src_text": "We would also like to highlight that we expose the unique dilemma regarding language model political biases.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir würden also gerne auch darauf hinweisen, dass wir das einzigartige Dilemma, das sich bei der Untersuchung der monologischen politischen Auffassungen", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_598.wav", "doc_id": "oeooqChmKK.seg_598", "src_text": "We introduce a coreference resolution task, designed to probe for the ability to draw on knowledge available in different sources.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir führen eine Korrelationsanalyse durch, die darauf ausgelegt ist, die Fähigkeit zu testen, Wissen aus verschiedenen Quellen abzurufen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_583.wav", "doc_id": "rISrKoXQCx.seg_583", "src_text": "If we do try to sanitaze somehow, we would also risk censorship, or exclusion.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wenn wir versuchen, uns zu sänitieren, würden wir auch Risiken der Zensur oder Ausgrenzung laufen, und", "score": 83.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_109.wav", "doc_id": "uZBWfYjYnf.seg_109", "src_text": "If we go on and we receive another speech chunk, and our model predicts other three words and we will look at those cross-attention weights, we will see that no word points to the last lambda speech frames.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wenn wir weitergehen, erhalten wir einen weiteren Sprachblock, und unser Modell sagt uns drei weitere Wörter, und wir schauen uns die Wechselwirkung an. Wir werden erkennen, dass kein Wort auf die letzte, lambe, lambe, lambe, lambe, lambe, lambe, lambe, lambe, lambe.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_302.wav", "doc_id": "PIZEXUFLAR.seg_302", "src_text": "We also can see transfer learning from natural instruction datasets can help OFA to attain much better performance on the natural instruct dataset.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir können auch sehen, dass das Übertragen aus den Datensätzen der natürlichen Anweisung OIF helfen kann, um eine viel bessere Leistung auf den Datensätzen der natürlichen Anweisung zu erzielen.", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_730.wav", "doc_id": "XejEJmgUmE.seg_730", "src_text": "Language model acceptability judgments are not always robust to context.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die Akzeptanzurteile des Sprachmodells sind nicht immer robust.", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_692.wav", "doc_id": "oaOHnMCwad.seg_692", "src_text": "We do this through our framework NLPositionality.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir tun dies durch unser Framework.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_121.wav", "doc_id": "uZBWfYjYnf.seg_121", "src_text": "Thanks for your attention.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vielen Dank für Ihre Aufmerksamkeit.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_219.wav", "doc_id": "oYCKgTzTDy.seg_219", "src_text": "As shown in this figure, we need to translate the query in multiple natural languages using neural models to SQL, Lambda or FunQL, and etcetera.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wie in der Abbildung zu sehen ist, müssen wir den Query in mehrere natürliche Sprachen übersetzen, indem wir Neuronenmodelle verwenden, wie z. B. Seq2Seq, Lambada oder Funktions-Query.", "score": 41.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_385.wav", "doc_id": "WBLMIsdIrq.seg_385", "src_text": "This work was done in collaboration with Patrick Fernandes, Emmy Liu, André F. T. Martins, and Graham Neubig.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Diese Arbeit wurde in Zusammenarbeit mit Patrick Fennan, M.E., und Andrew F. Martens durchgeführt. Die Übersetzungen hängen also", "score": 14.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_74.wav", "doc_id": "TVCREhgqUP.seg_74", "src_text": "Conceptually, our permutation model works roughly like this.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Konzeutell, unser Permutationenmodell funktioniert ungefähr so.", "score": 87.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_684.wav", "doc_id": "oaOHnMCwad.seg_684", "src_text": "Positionality is simply the perspectives that people hold as a result of their demographics, identity, and life experiences.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "ist einfach die Perspektive, die die Menschen aufgrund ihrer Demografie, Identität und Lebenserfahrungen haben.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_727.wav", "doc_id": "oaOHnMCwad.seg_727", "src_text": "Thank you.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vielen Dank.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_532.wav", "doc_id": "dvGkKzmIaN.seg_532", "src_text": "Back door data set contains sentences of which all words belong to the trigger set while all words in the sentences of benign data set do not belong to the trigger sets.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Datensatz; der Datensatz der Rückwand enthält Sätze, deren alle Wörter dem Trigger-Set gehören, während alle Wörter in den Sätzen des bösartigen Datensatzes nicht dem Trigger-Set gehören.", "score": 88.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_66.wav", "doc_id": "TVCREhgqUP.seg_66", "src_text": "In this paper, we don't use trees and introduce a neural seq2seq model that directly models the correspondences between fragments of the input and fragments of the output.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "In diesem Papier verwenden wir keine Traces und stellen ein neues Sequenz-zu-Sequenz-Modell vor, das die Korrespondenzen zwischen den Fragmenten des Inputs und den Fragmenten des Outputs direkt modelliert.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_773.wav", "doc_id": "WTTtiRKFZI.seg_773", "src_text": "So for example, in the universal dependencies, the structure of the coordination, Lisa, Bart, and Maggie, such that the first conjunct is the head of the whole coordinate structure.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "dass die Struktur der Abhängigkeitskoordination Lisa Bart und Meggie ist. Es ist so, dass der erste Konjunkt der Kopf der ganzen Struktur ist.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_460.wav", "doc_id": "hgIDlKNiFM.seg_460", "src_text": "We are also observing that more specialized data is better, but it doesn't scale well.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir stellen auch fest, dass spezialisierte Daten besser sind - mehr spezialisierte Daten sind besser - aber sie werden nicht gut genutzt.", "score": 57.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_19.wav", "doc_id": "aQpIWggfCo.seg_19", "src_text": "The heat map in the figure shows that the planning performance of InstructGPTs varies considerably for goals of different categories.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die Übersicht im Bild zeigt, dass die Planungsleistung von Mädchen in verschiedenen Kategorien sehr unterschiedlich", "score": 44.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_423.wav", "doc_id": "WBLMIsdIrq.seg_423", "src_text": "This again demonstrates that it is difficult to determine the best document-level translation system if we use corpus-level metrics alone.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Dies zeigt wieder, dass es schwierig ist, das beste Dokumenten-Übersetzungs-System zu bestimmen, wenn man nur Korpus-Ebenniveaumetriken verwendet. Jetzt", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_611.wav", "doc_id": "oeooqChmKK.seg_611", "src_text": "We have defined three settings of KITMUS.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir haben drei Einstellungen für Kidmus definiert.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_205.wav", "doc_id": "SLpqvupgvW.seg_205", "src_text": "The AltEntities Corpus has 6,000 alternative questions across three domains, and it has 42,000 indirect referring expressions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "tausend alternative Fragen in drei Domänen und zweitausend indirekte Referenzäußerungen.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_337.wav", "doc_id": "dJGfOSFgZO.seg_337", "src_text": "However, this is all the more reason to pursue reliable and precise evaluation metrics for comparing models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Dies ist jedoch der Grund, warum wir zuverlässige und präzise Bewertungsmetriken für Vergleichsmodelle verwenden.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_226.wav", "doc_id": "oYCKgTzTDy.seg_226", "src_text": "We provide a uniform data set XSemPLR for cross-lingual semantic parsing in multiple natural languages and meaning representations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "vor, wir stellen ein einheitliches Datensatz-Beispiel für das Cross-Lingual-Semantic-Parsing in mehreren natürlichen Sprachen und Bedeutungsdarstellungen bereit. Es", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_396.wav", "doc_id": "WBLMIsdIrq.seg_396", "src_text": "And second, how well do models handle these cases?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Kontext? Zweitens: Wie gut können Modelle diese Fälle handhaben?", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_261.wav", "doc_id": "oYCKgTzTDy.seg_261", "src_text": "And welcome to visit our paper and code.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "die Aufmerksamkeit.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_10.wav", "doc_id": "aQpIWggfCo.seg_10", "src_text": "In this paper, we first evaluate and improve the constrained language planning ability of large language models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "In diesem Papier bewerten und verbessern wir zunächst die eingeschränkte Planungsfähigkeit von Sprachmodellen in Großschreibung. Außerdem gibt", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_793.wav", "doc_id": "WTTtiRKFZI.seg_793", "src_text": "Because then it can be moved to the position after the adjunct.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "weil es dann in die Position nach dem Add-on bewegt", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_496.wav", "doc_id": "SUkmfOTvGi.seg_496", "src_text": "And we found that the answer is actually a resounding yes.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "wir haben festgestellt, dass die Antwort tatsächlich ein lautes „Ja“ ist.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_78.wav", "doc_id": "TVCREhgqUP.seg_78", "src_text": "We determine the third token in the output in a similar way by jumping to another multiset token.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir bestimmen den dritten Token in der Ausgabe auf ähnliche Weise, indem wir zu einem anderen Multisets-Token springen;", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_810.wav", "doc_id": "WTTtiRKFZI.seg_810", "src_text": "And, also the observation that was made in parsing that this tendency grows with length difference.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Und auch die Beobachtung, die gemacht wurde, als sie vorüberging, dass eine Dissonanz mit langen Unterschieden wächst,", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_722.wav", "doc_id": "oaOHnMCwad.seg_722", "src_text": "And a good example of this is the Masakhani initiative.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und ein gutes Beispiel dafür", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_305.wav", "doc_id": "PIZEXUFLAR.seg_305", "src_text": "So one more thing, we are collecting a much larger multi-model instruction tuning dataset with around 150 additional vision language tasks and we will release them.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "für die Anpassung mit etwa einhundertfünfzig zusätzlichen Sprachaufgaben und geben sie heraus.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_117.wav", "doc_id": "uZBWfYjYnf.seg_117", "src_text": "And we see that it outperforms all the strategies applied to offline models since the curves are shifted over the left.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und wir sehen, dass alle Strategien, die auf Offline-Modelle angewendet werden, seit den Kurven nach links verschoben sind.", "score": 48.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_589.wav", "doc_id": "oeooqChmKK.seg_589", "src_text": "Hello everyone, I'm Akshatha, and today my co-author Martin and I are presenting our work \"The KITMUS Test: Evaluating Knowledge Integration from Multiple Sources.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hallo alle, ich bin Ashutosh und heute präsentieren mein Co-Autor Martin und ich unsere Arbeit, den KITMST-Test: Die Bewertung der Integration von Wissen aus mehreren Quellen.", "score": 53.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_258.wav", "doc_id": "oYCKgTzTDy.seg_258", "src_text": "We conduct a comprehensive benchmark study on three representative types of multilingual language models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir führen eine umfassende Benchmark-Studie an drei repräsentativen Typen von mehrsprachigen Modellen durch", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_512.wav", "doc_id": "dvGkKzmIaN.seg_512", "src_text": "First the method should be applicable to embedding as services.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Erstens sollte die Methode auf eingebettete ET-Verbindungen anwendbar sein.", "score": 52.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_203.wav", "doc_id": "SLpqvupgvW.seg_203", "src_text": "Here are some examples from our dataset.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hier sind einige Beispiele aus unserem Datensatz.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_512.wav", "doc_id": "dvGkKzmIaN.seg_512", "src_text": "First the method should be applicable to embedding as services.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Erstens sollte die Methode für die Einbettung von Dienstleistungen anwendbar sein.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_726.wav", "doc_id": "oaOHnMCwad.seg_726", "src_text": "But if you'd like to learn more, feel free to check out our dashboard for the most updated analysis results and our paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Empfehlung. Wenn Sie jedoch mehr erfahren möchten, können Sie gerne unser Dashboard für die neuesten Analyseergebnisse und unser Papier überprüfen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_432.wav", "doc_id": "hgIDlKNiFM.seg_432", "src_text": "In this presentation, we first talk about language modeling in healthcare.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In dieser Präsentation sprechen wir zunächst über Sprachmodellierung in der Gesundheitsversorgung,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_294.wav", "doc_id": "PIZEXUFLAR.seg_294", "src_text": "As we can see, instruction tuning can significantly improve OFA's performance on seen multi-modal tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "die Leistung von Multimodaltasks erheblich verbessern. Auch das Transfer-Lernen aus", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_874.wav", "doc_id": "GvEBWkLmuI.seg_874", "src_text": "First, we should, as researchers, be addressing positive stereotypes and essentializing narratives.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Eigentümer von Modellen zusammenstellen. Zunächst sollten wir als Forscher positive Stereotypen und Essentials beschreiben und", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_246.wav", "doc_id": "oYCKgTzTDy.seg_246", "src_text": "We found that Encoder-Decoder or Encoder-PTR can be improved by training in a mixture of various languages.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und fanden heraus, dass Encoder-Decoder oder Encoder-PDR verbessert werden kann, indem sie in einer Mischung verschiedener Sprachen trainiert werden.", "score": 82.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_640.wav", "doc_id": "FLkGnzVRew.seg_640", "src_text": "So why does this matter?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Warum ist das also wichtig?", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_272.wav", "doc_id": "PIZEXUFLAR.seg_272", "src_text": "Here we present MultiInstruct, the first multi-modal instruction tuning benchmark dataset that consists of 62 diverse multi-modal tasks covering 10 broad categories.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hier stellen wir MultiInstrukt vor, das erste MultiModal Instruction Tuning Benchmark-Datensatz, der aus 62 verschiedenen MultiModalen Tasks besteht, die zehn Borencategorien abdecken.", "score": 21.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_759.wav", "doc_id": "XejEJmgUmE.seg_759", "src_text": "And there we see that the MPP judgments either increase or decrease significantly when you add either acceptable prefixes or unacceptable prefixes.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Dort sehen wir, dass die MP3-Urteile sich entweder erheblich oder unerheblich verändern, wenn man entweder akzeptable oder inakzeptable Präfixe hinzufügt.", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_352.wav", "doc_id": "gGbuDbHhyc.seg_352", "src_text": "We can't stop on this problem setting, but this implies that additional manual annotations are required in weakly supervised learning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zweifel an dieser Problemstellung, da dies impliziert, dass zusätzliche manuelle Anmerkungen bei der Erstellung von Wochenplanen erforderlich sind,", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_654.wav", "doc_id": "FLkGnzVRew.seg_654", "src_text": "We transfer from two different tasks: topic independent dissonance stance classification, a task that determines if two debate statements from different people are in agreement or in disagreement, irrespective of topic, called debate here, and on binary classification of expansion and comparison classes of PDTB since these two are closely related to the conception of consonance and dissonance and we call them CE here.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir übertragen aus zwei verschiedenen Themen, unabhängige Themenklassifikation, ob zwei Erklärungen von verschiedenen Personen in Übereinstimmung oder im Widerspruch zum Thema stehen. Die Debatte hier und über die binäre Klassifizierung von Expansion und Vergleichsklassen von Pente, da diese eng mit der Konzeption von Konsonanten und Dissonanzen zusammenhängen, und wir nennen sie hier.", "score": 35.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_54.wav", "doc_id": "TVCREhgqUP.seg_54", "src_text": "And \"Mary knew that the girl slept.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Mary wusste, dass das Mädchen geschlafen", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_147.wav", "doc_id": "wLqFAuDnKa.seg_147", "src_text": "The dev data is much more curated, and with higher quality than the training data, that it's more noisy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Die dev-Daten sind viel sorgfältiger bearbeitet und haben eine höhere Qualität als die Trainingsdaten, die lauter sind, und die Ergebnisse zeigen eine", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_317.wav", "doc_id": "dJGfOSFgZO.seg_317", "src_text": "However, we believe there is a more precise and reliable strategy for dimensional dialogue evaluation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir glauben jedoch, dass es eine präzisere und zuverlässigere Strategie für die dimensionale Dialogbewertung gibt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_320.wav", "doc_id": "dJGfOSFgZO.seg_320", "src_text": "We developed this method to comprehensively cover chat model behaviors that have been suggested to affect chat quality in recent literature.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir entwickelten diese Methode, um umfassend Chat-Modellverhaltens zu decken, die vorgeschlagen wurden, um Chat-Qualität und jüngste Literatur zu beeinflussen.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_363.wav", "doc_id": "gGbuDbHhyc.seg_363", "src_text": "Our second finding is that increasing the number of clean validation samples will help WSL approaches to achieve better performance, as shown in the figure on the left.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Unsere zweite Erkenntnis ist, dass eine Erhöhung der Anzahl der Clean-Validation-Samples den WSL-Ansätzen helfen wird, bessere Leistungen zu erzielen, wie in der Abbildung auf der linken Seite gezeigt.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_854.wav", "doc_id": "GvEBWkLmuI.seg_854", "src_text": "Now for some results.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Manchmal haben wir Erfolge,", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_383.wav", "doc_id": "WBLMIsdIrq.seg_383", "src_text": "Hello, my name is Kayo Yin and I will be presenting our work titled \"When Does Translation Require Context?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hallo, mein Name ist Kai-Ohin und ich werde unsere Arbeit präsentieren, die sich auf die Erforschung von Daten", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_253.wav", "doc_id": "oYCKgTzTDy.seg_253", "src_text": "We found that, by comparing the green and orange line, we found the Zero-shot setting, the Cross-lingual transfer performance gap is significant, and then comparing the blue and orange lines, we found that with the Few-shot setting the transfer gap is shortened rapidly.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir fanden heraus, dass durch Vergleich der grünen und orangefarbenen Linie wir fanden, dass für die Einstellung ohne Schüsse die Leistungsdifferenz bei der Übersetzung zwischen Sprachen signifikant ist, und durch Vergleich der blauen und orangefarbenen Linie. Wir fanden heraus, dass mit wenigen Schritten die Transferlücke schnell geschlossen wird.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_211.wav", "doc_id": "SLpqvupgvW.seg_211", "src_text": "If the language model has access only to entity names, then the accuracy is only 60%, so there's a lot of room for improvement.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wenn das Sprachmodell nur Zugriff auf Entitätennamen hat, ist die Genauigkeit nur 60%, also gibt es viel Raum für Verbesserungen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_542.wav", "doc_id": "dvGkKzmIaN.seg_542", "src_text": "As shown in the figures, it's hard to distinguish between, the backdoor embeddings and normal embeddings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wie die Zahlen zeigen, ist es schwierig, zwischen den Backdoor-Embeddings und den normalen Embeddings zu unterscheiden.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_336.wav", "doc_id": "dJGfOSFgZO.seg_336", "src_text": "With the rapid pace of improvement in the field, many of these error rates could see a decrease in new models released since our evaluation was conducted.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Mit dem schnellen Tempo der Verbesserung in diesem Bereich könnten viele dieser Fehler in neuen Modellen, die veröffentlicht werden, abnehmen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_521.wav", "doc_id": "dvGkKzmIaN.seg_521", "src_text": "Watermark injection and copyright verification.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wasserzeichen-Einspritzung und Urheberrechtsanwendung. Bevor wir diese", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_324.wav", "doc_id": "dJGfOSFgZO.seg_324", "src_text": "For comparison, we also evaluated these conversations using three existing methods: Likert ratings on the turn-level, Likert ratings on the dialogue-level, and dialogue-level pairwise comparisons.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zum Vergleich haben wir diese Gespräche auch mit drei bestehenden Methoden bewertet. Lickert-Bewertungen auf der Drehstufe, Lickert-Bewertungen auf der Dialogstufe und Dialogstufe-paareisen Vergleiche.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_433.wav", "doc_id": "hgIDlKNiFM.seg_433", "src_text": "Then we will present the main contribution of our article.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "stellen wir die wichtigsten Beiträge unserer Arbeit vor:", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_10.wav", "doc_id": "aQpIWggfCo.seg_10", "src_text": "In this paper, we first evaluate and improve the constrained language planning ability of large language models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In dieser Arbeit bewerten wir zunächst und verbessern die konstruktionsbezogene Sprachplanungsfähigkeit von großen Sprachmodellen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_814.wav", "doc_id": "WTTtiRKFZI.seg_814", "src_text": "Right?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "ist der", "score": 5.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_457.wav", "doc_id": "hgIDlKNiFM.seg_457", "src_text": "However, our experiment on control pre-training using the weight and tokenization of CamemBERT trained on the four GB subset of NACHOS showed comparable results to those obtained with DrBERT 4 GB from-scratch.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Weiterbildung, bei denen wir das Gewicht und den Tokenizer von Pamper Bert verwenden, und trainieren, um auf dem 4-GB-Untersatz von Natures zu trainieren, zeigen vergleichbare Ergebnisse, die wir mit dem Doktor Bert von Scratch erzielt haben.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_50.wav", "doc_id": "TVCREhgqUP.seg_50", "src_text": "Compositional generalization can be understood as the ability of a learner to handle deeper recursion and unseen compositions of phrases that have been seen individually during training.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die kompositorische Generalisierung kann als die Fähigkeit eines Lernenden verstanden werden, tiefere Wiederholungen und unsichtbare Kompositionen von Aussagen zu handhaben, die während des Trainings individuell gesehen wurden.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_465.wav", "doc_id": "SUkmfOTvGi.seg_465", "src_text": "Let's get started.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Sie uns anfangen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_802.wav", "doc_id": "WTTtiRKFZI.seg_802", "src_text": "When you swap these two constituents, the sum of these two dependencies becomes 6.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "und sie austauschen, wird die Summe dieser beiden Abhängigkeiten sechs", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_742.wav", "doc_id": "XejEJmgUmE.seg_742", "src_text": "So what we do is that to simulate these longer sequences, we revisit the data sets themselves and then we recreate sentences by choosing acceptable or unacceptable sentences from those datasets.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "den wir verwenden, um diese längeren Sequenzen zu simulieren: Wir überprüfen die Datensätze selbst und erstellen dann Sätze aus akzeptablen oder unakzeptablen Sätzen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_877.wav", "doc_id": "GvEBWkLmuI.seg_877", "src_text": "We just really can't make any assumptions or really study that further, without more transparency.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "können wir wirklich keine Annahmen treffen oder sie weiter studieren, ohne mehr Transparenz.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_623.wav", "doc_id": "oeooqChmKK.seg_623", "src_text": "Without task-specific training on KITMUS, both models do not perform well.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Modelle schnitten beim spezifischen Training auf dem Kidus nicht gut ab,", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_554.wav", "doc_id": "rISrKoXQCx.seg_554", "src_text": "To this end, we propose to investigate the political bias propagation pipeline from pretraining data to language models to downstream tasks, specifically by asking the following questions: First, how do we evaluate the political leaning of language models and what role does pretraining data might have on such political biases?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zu diesem Zweck schlagen wir vor, die politische Bias-Verbreitungspipeline von der Vorverarbeitung von Daten zu Sprachmodellen zu Downstream-Aufgaben zu untersuchen, insbesondere, indem wir die folgenden Fragen stellen: Zunächst, wie bewerten wir die politische Ausrichtung von Sprachmodellen und welche Rolle könnte die Vorverarbeitung von Daten auf solche politischen Vorurteile haben?", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_787.wav", "doc_id": "WTTtiRKFZI.seg_787", "src_text": "The argument is based on the principle of dependency length minimization that I will explain on the basis of these examples.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Argument basiert auf dem Prinzip der Abhängigkeit der Minimierung, das ich auf der Grundlage dieser Beispiele erkläre. So", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_480.wav", "doc_id": "SUkmfOTvGi.seg_480", "src_text": "The second ingredient is the model size.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Der zweite Bestandteil ist die Modellgröße.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_124.wav", "doc_id": "wLqFAuDnKa.seg_124", "src_text": "PaLM is a 540 billion-parameter large language model presented last year in 2022.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "PP-1M ist ein 540 Milliarden Parameter großes Sprachmodell, das im Jahr 2022 vorgestellt wurde.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_760.wav", "doc_id": "XejEJmgUmE.seg_760", "src_text": "But when we match the structure, that is when we choose the sentences from the same phenomena in BLiMP or SyntaxGym, we see a massive increase or a massive decrease of the MPP judgement for the model, depending on whether the chosen prefix is acceptable or unacceptable.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Aber wenn wir die Struktur übereinstimmen lassen, dann ist das der Zeitpunkt, an dem wir die Sätze aus dem gleichen Phänomen im Blimp-Syntax auswählen. Wir sehen eine massive Zunahme oder Abnahme des MPP-Urteils für das Modell, je nachdem, ob das gewählte Präfix akzeptabel oder nicht akzeptabel ist.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_780.wav", "doc_id": "WTTtiRKFZI.seg_780", "src_text": "The conjunction headed approach assumed in Prague dependency treebanks, where coordinate structures are headed by the conjunction.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Prag-Ansatz und der Konjunktions-Ansatz, die Koordinatenstrukturen, die von der Konjunktion abhängen. Daher", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_137.wav", "doc_id": "wLqFAuDnKa.seg_137", "src_text": "So, it's important to select a good prompting strategy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "erreichen, weshalb es wichtig ist, eine gute Strategie zu wählen.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_710.wav", "doc_id": "oaOHnMCwad.seg_710", "src_text": "So for the GPT 4 social acceptability analysis, we find that it's most aligned to confucian and English speaking countries.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und für die GPD-Analyse finden wir heraus, dass die meisten englischsprachigen Länder am besten geeignet sind. Wir finden", "score": 59.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_639.wav", "doc_id": "FLkGnzVRew.seg_639", "src_text": "While dissonance is a very common phenomenon we experienced in daily decision making, they are really rare to find expressed in language among other kinds of discourse relations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Während Dissens ein sehr häufiges Phänomen ist, das wir in der täglichen Entscheidungsfindung erleben, sind sie in der Sprache in anderen Arten von Diskursbeziehungen wirklich selten ausgedrückt.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_92.wav", "doc_id": "uZBWfYjYnf.seg_92", "src_text": "Hi, I'm Sara Papi from the University of Trento and Foundazione Bruno Kessler and I will briefly introduce the \"Attention as a Guide for Simultaneous Speech Translation\" paper, that is a joint work with Matteo Negri and Marco Turchi.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hallo, ich bin Sera Papí von der Universität von Trento und der Bruno Kessler Foundation, und ich werde kurz die Aufmerksamkeit als Leitfaden für ein Simultansprachübersetzungs-Papier vorstellen, das eine gemeinsame Arbeit mit Matteo Negri und Marco Turci ist.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_2.wav", "doc_id": "aQpIWggfCo.seg_2", "src_text": "In everyday life, humans often plan their actions by following step-by-step instructions in the form of goal-oriented scripts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Im Alltag planen Menschen oft ihre Handlungen, indem sie Schritt für Schritt Anweisungen in Form von orientierten Skripten befolgen.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_35.wav", "doc_id": "aQpIWggfCo.seg_35", "src_text": "In total, we generate 55,000 specific goals with scripts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Insgesamt generieren wir 50.000 spezifische Ziele mit Skripten,", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_652.wav", "doc_id": "FLkGnzVRew.seg_652", "src_text": "To alleviate this, we experiment over combinations of transfer learning and active learning to annotate such that more dissonant samples can be collected over lesser annotation runs, lowering the overall annotation costs while improving dissonance detection.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Um dies zu erleichtern, experimentieren wir mit Kombinationen von Transferlernen und aktiven Lernvorgängen, um solche Anmerkungen zu annotieren, sodass mehr Dissensbeispiele über weniger Anmerkungsrunden gesammelt werden können, wodurch die Gesamtkosten der Anmerkung verringert und die Dissensdetektion verbessert werden.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_825.wav", "doc_id": "WTTtiRKFZI.seg_825", "src_text": "So see the paper for the full arguments.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Sehen Sie sich das Papier für die vollständige", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_554.wav", "doc_id": "rISrKoXQCx.seg_554", "src_text": "To this end, we propose to investigate the political bias propagation pipeline from pretraining data to language models to downstream tasks, specifically by asking the following questions: First, how do we evaluate the political leaning of language models and what role does pretraining data might have on such political biases?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zu diesem Zweck schlagen wir vor, die politische Verbreitungspipeline zu untersuchen, indem Daten vorbereitet werden, Sprachmodelle erstellt werden, Downstream-Tasks durchgeführt werden, insbesondere durch die Stellung der folgenden Fragen. Erstens: Wie können wir die politische Tendenz von Sprachmodellen bewerten, und welche Rolle könnte Perusini-Daten auf solche politischen Vorurteile haben? Zweitens,", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_122.wav", "doc_id": "wLqFAuDnKa.seg_122", "src_text": "Hello everyone, my name is David Vilar, and I will be giving a short review of the paper \"Prompting PaLM for Translation: Assessing Strategies and Performance.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hallo, mein Name ist Vilar und ich werde eine kurze Zusammenfassung des Papiers zu Übersetzung, Bewertung, Strategien und Leistung geben.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_162.wav", "doc_id": "SLpqvupgvW.seg_162", "src_text": "Our goal is to understand users’ language when they want to make a choice.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Unser Ziel ist es, die Sprache der Benutzer zu verstehen, wenn sie eine Wahl treffen", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_665.wav", "doc_id": "FLkGnzVRew.seg_665", "src_text": "On further rounds of AL with two best strategies, we improve dissonance classification AUC to 0.75, which is the best performance that we have on the task so far.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Bei den nächsten Runden mit zwei besseren Strategien verbessern wir die Diskriminanzklassifizierung von AUC zu einem Punkt von sieben, was die beste Leistung ist, die wir bisher hatten.", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_406.wav", "doc_id": "WBLMIsdIrq.seg_406", "src_text": "And this allows us to find, for example, dual pronouns in Arabic that have relatively high P-CXMI.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "sich zum Beispiel anhand von Pronomen in Arabisch bestimmen, die einen hohen Index haben,", "score": 42.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_378.wav", "doc_id": "gGbuDbHhyc.seg_378", "src_text": "Third, continuous fine-tuning is a simple yet strong baseline that should be considered in future work in WSL.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Drittens ist die kontinuierliche Feinabstimmung eine einfache, aber starke Grundlage, die in zukünftigen Arbeiten in WSL berücksichtigt werden sollte.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_263.wav", "doc_id": "PIZEXUFLAR.seg_263", "src_text": "Hello everyone, my name is Ying and my colleague Zhiyang and I will be presenting our research on MultiInstruct improving Multi-Modal Zero-Shot Learning via Instruction Tuning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hallo, ich heiße Yen und ich werde zusammen mit meinem Kollegen Jiajun unsere Forschung über Multi-Instruction präsentieren, die multimodale neuronale Lernfähigkeit durch Anpassung der Anweisungen verbessert.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_525.wav", "doc_id": "dvGkKzmIaN.seg_525", "src_text": "In watermark injection, we first define a target embedding.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wasserzeicheninjektion definieren wir zunächst eine Zielverankerung.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_178.wav", "doc_id": "SLpqvupgvW.seg_178", "src_text": "And with that, Bob sets the dialogue context.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und mit dieser Aussage setzt Bob den Dialogkontext.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_100.wav", "doc_id": "uZBWfYjYnf.seg_100", "src_text": "So what is our solution?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Was ist unsere Lösung?", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_856.wav", "doc_id": "GvEBWkLmuI.seg_856", "src_text": "However, when we actually look at the distribution of the words and lexicon, we find very different things.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wenn wir jedoch die Verteilung der Wörter im Lexikon betrachten, finden wir sehr unterschiedliche Dinge, sodass", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_383.wav", "doc_id": "WBLMIsdIrq.seg_383", "src_text": "Hello, my name is Kayo Yin and I will be presenting our work titled \"When Does Translation Require Context?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo, mein Name ist Kayen und ich werde unsere Arbeit mit dem Titel „Windows-Übersetzungskontext: Eine mehrsprachige Untersuchung“", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_358.wav", "doc_id": "gGbuDbHhyc.seg_358", "src_text": "We addressed these research questions in our work and our findings are as follows.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir behandeln diese Forschungsfragen in unserer Arbeit und unsere Ergebnisse sind wie folgt. Zunächst stellen", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_279.wav", "doc_id": "PIZEXUFLAR.seg_279", "src_text": "Ok, now I'm going to talk about multi-modal instruction tuning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Okay, nun werde ich über die Abstimmung der Multimode-Anweisung sprechen.", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_775.wav", "doc_id": "WTTtiRKFZI.seg_775", "src_text": "A similar approach is assumed in Igor Mel'čuk's meaning text theory, where again, the whole coordinate structure is headed by the first conjuct.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "wird durch die gesamte korrelierte Struktur bestimmt, so dass diese beiden", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_539.wav", "doc_id": "dvGkKzmIaN.seg_539", "src_text": "The results on four data sets show that our embedding marker can have great detection performance while keep great utility for downstream tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die Ergebnisse auf vier Datensätzen zeigen, dass unser eingebetteter Marker eine ausgezeichnete Erkennungsleistung aufweist und gleichzeitig eine ausgezeichnete Nützlichkeit für Downstream-Aufgaben hat.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_868.wav", "doc_id": "GvEBWkLmuI.seg_868", "src_text": "And finally, for black women, we see that some of the top words are things like \"strong\" and \"resilient\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "schließlich sehen wir bei schwarzen Frauen, dass einige der obersten Wörter Dinge sind, die stark und widerstandsfähig sind.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_174.wav", "doc_id": "SLpqvupgvW.seg_174", "src_text": "Our data set covers three different domains: music, books, and recipes.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Unser Datensatz umfasst drei verschiedene Bereiche: Musik, Bücher und Rezensionen.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_710.wav", "doc_id": "oaOHnMCwad.seg_710", "src_text": "So for the GPT 4 social acceptability analysis, we find that it's most aligned to confucian and English speaking countries.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "So finden wir heraus, dass die Datensätze und Modelle für die GPD4-Social-Acceptability-Analyse am meisten mit Konfuzianismus und englischsprachigen Ländern ausgerichtet sind.", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_123.wav", "doc_id": "wLqFAuDnKa.seg_123", "src_text": "This is joint work with my colleagues from Google Translate.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dies ist eine gemeinsame Arbeit mit meinen Kollegen von Google Translate.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_320.wav", "doc_id": "dJGfOSFgZO.seg_320", "src_text": "We developed this method to comprehensively cover chat model behaviors that have been suggested to affect chat quality in recent literature.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir entwickeln diese Methode, um Verhaltensweisen in Chat-Modellen abzubilden, die sich auf die Chat-Qualität", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_853.wav", "doc_id": "GvEBWkLmuI.seg_853", "src_text": "So for instance, for the personas of black women, we would do Fightin’ Words and compare the log-odds ratios against both white personas and man personas because those are the two corresponding unmarked groups.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zum Beispiel für die Personas von schwarzen Frauen würden wir Fighting Words verwenden und die Logits-Raten gegenüber sowohl weißen Personas als auch männlichen Personas vergleichen, weil es sich um die zwei korrespondierenden unmarkierten Gruppen handelt.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_369.wav", "doc_id": "gGbuDbHhyc.seg_369", "src_text": "As we can see from the figures, the vanilla model, termed FTw, initially underperforms more complicated WSL methods, like COSINE.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wie aus den Zahlen ersichtlich, schneidet das Wallina-Modell mit der Bezeichnung FTW anfangs schlechter ab als komplexere WSL-Methoden wie Kosinus.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_770.wav", "doc_id": "XejEJmgUmE.seg_770", "src_text": "Thank you for listening.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Vielen Dank für Ihre Aufmerksamkeit.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_815.wav", "doc_id": "WTTtiRKFZI.seg_815", "src_text": "So the governor is on the left in this example \"I saw Bart and Lisa\" so is the governor is on the left.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "diesem Beispiel der Gouverneur links ist, aber nicht in dem", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_457.wav", "doc_id": "hgIDlKNiFM.seg_457", "src_text": "However, our experiment on control pre-training using the weight and tokenization of CamemBERT trained on the four GB subset of NACHOS showed comparable results to those obtained with DrBERT 4 GB from-scratch.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "meisten Aufgaben zu führen. Unsere Experimente mit kontinuierlicher Gewichtsabtastung mit dem Gewichts- und Token-System von Pommert zeigen vergleichbare Ergebnisse mit denen von Dr. Pommert.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_149.wav", "doc_id": "wLqFAuDnKa.seg_149", "src_text": "Nevertheless, specialized state-of-the-art systems have a substantial advantage over the PaLM translations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "haben spezialisierte Systeme einen erheblichen Vorteil gegenüber den", "score": 1.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_439.wav", "doc_id": "hgIDlKNiFM.seg_439", "src_text": "Since then, this model has been adapted to many other languages, like in French with CamemBERT, and also in domains like biomedical with PubMedBERT and BioBERT and on clinical with ClinicalBERT, but mostly in English.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Seitdem wurde dieses Modell an viele andere Sprachen angepasst, z. B. an Französisch mit Camber und andere Domänen wie Biomedizin mit Pamet und Biot sowie klinisch mit klinischen Begriffen, aber größtenteils auf Englisch. Spezialisierte Modelle für", "score": 35.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_618.wav", "doc_id": "oeooqChmKK.seg_618", "src_text": "In the Background-Pretrain setting, we assume that the background knowledge \"Politicians seek elected seats in government\" is contained in the pretrained parameters and in inference-time context we provide the entity-specific knowledge \"Chichester is a politician.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Im Hintergrundwissen nehmen wir an, dass die Hintergrundkenntnisse von Politikern, die Sitze in der Regierung anstreben, in den Hintergrundparametern enthalten sind. Im Kontext der Gegenwart liefern wir das spezifische Wissen, dass Chester ein Politiker", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_342.wav", "doc_id": "gGbuDbHhyc.seg_342", "src_text": "In this video, I would like to present our recent work \"Weaker Than You Think: A Critical Look at Weakly Supervised Learning.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In diesem Video möchte ich unsere jüngste Arbeit präsentieren, eine kritische Betrachtung der wöchentlichen Nachrichten.", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_449.wav", "doc_id": "hgIDlKNiFM.seg_449", "src_text": "Another also based on CamemBERT, but trained this time on the 4 GB of clinical notes and finally, one based on English biomedical model PubMedBERT, and trained on 4 GB of set of NACHOS.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "dem Gewicht von Camembert, aber trainiert diesmal auf vier Gigabyte von Klicks. Insgesamt haben wir sieben Modelle. Um alle sieben Modelle zu bewerten, haben wir öffentliche und private Aufgaben wie Name-Recognition, Klassifizierung, Part-of-Speech-Tagging und Fragen beantworten. Dieses Modell entspricht sechs Zeilen des Modells,", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_702.wav", "doc_id": "oaOHnMCwad.seg_702", "src_text": "Afterwards to stay engaged in the study, they can compare their responses to an AI and others.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Danach können sie die Antworten von A und anderen vergleichen.", "score": 38.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_744.wav", "doc_id": "XejEJmgUmE.seg_744", "src_text": "And what we do is that to recreate like longer sequences and which are acceptable and which has the same matching of the grammatical structure.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In unserem Fall erzeugen wir längerer Sequenzen, die akzeptabel sind und die gleiche grammatische Struktur haben,", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_480.wav", "doc_id": "SUkmfOTvGi.seg_480", "src_text": "The second ingredient is the model size.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Der zweite Bestandteil ist die Modellgröße.", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_650.wav", "doc_id": "FLkGnzVRew.seg_650", "src_text": "To no surprise, the classifier performed not much better than chance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Es war kein Wunder, dass der Klassifikator nicht viel besser als zufällig leistete.", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_780.wav", "doc_id": "WTTtiRKFZI.seg_780", "src_text": "The conjunction headed approach assumed in Prague dependency treebanks, where coordinate structures are headed by the conjunction.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "den Konjunktionskopf-Ansatz in pragmatischen Abhängigkeitstrees, wo koordinierte Strukturen vom Konjunkt angeführt werden.", "score": 44.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_641.wav", "doc_id": "FLkGnzVRew.seg_641", "src_text": "Studying cognitive dissonance can help us understand the effects of disagreement among people, track trends and belief values, and attitude changes in population.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Die Untersuchung kognitiver Differenzen kann uns helfen, die Auswirkungen von Meinungsverschiedenheiten unter Menschen zu verstehen, Trends in Überzeugungen, Werten und Einstellungen in der Bevölkerung zu verfolgen und die Auswirkungen von sozialen und kulturellen Veränderungen auf die kognitive Landschaft zu verstehen.", "score": 82.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_249.wav", "doc_id": "oYCKgTzTDy.seg_249", "src_text": "We also compare the cross-language performance gap.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir vergleichen auch die Cross-Lang Performance Gap.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_322.wav", "doc_id": "dJGfOSFgZO.seg_322", "src_text": "For example, ABC-Eval measures the number of turns in which a chat model ignores its partner or says something irrelevant, contradicts itself or its partner, hallucinates incorrect facts or violates common sense knowledge, and when the model succeeds or fails to show empathy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zum Beispiel misst A.B.C. E.V.A.L. die Anzahl der Drehungen, in denen ein Chat-Modell seinen Partner ignoriert oder etwas Irrelevantes sagt. Widerspricht sich selbst oder seiner Partner, halluziniert falsche Tatsachen oder verletzt das Wissen des Menschen, und wenn das Modell die Empathie zeigt oder nicht zeigt, ist es erfolgreich oder scheitert.", "score": 68.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_402.wav", "doc_id": "WBLMIsdIrq.seg_402", "src_text": "Now we analyze words with high P-CXMI to look for patterns between these words.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "sind. Jetzt analysieren wir die Wörter mit hoher Häufigkeit, um die Paare zwischen diesen Wörtern zu finden.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_262.wav", "doc_id": "oYCKgTzTDy.seg_262", "src_text": "Thanks for listening.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "fürs Zuhören.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_541.wav", "doc_id": "dvGkKzmIaN.seg_541", "src_text": "The legend of the figures means the number of triggers in each sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die Legende der Figuren bedeutet die Anzahl der Auslöser in jedem Satz.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_334.wav", "doc_id": "dJGfOSFgZO.seg_334", "src_text": "For example, the bots we tested have common sense violations in around 20% of their responses.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zum Beispiel haben die Roboter, die wir getestet haben, in etwa 20% ihrer Antworten Verstöße gegen den Grundsatz der Menschenwürde gezeigt.", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_77.wav", "doc_id": "TVCREhgqUP.seg_77", "src_text": "Then we jump to the next multiset token, to determine the second token in the output.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dann springen wir zum nächsten Multiset-Token, um den zweiten Token im Output zu bestimmen.", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_186.wav", "doc_id": "SLpqvupgvW.seg_186", "src_text": "Do you mean A or B?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Meinen Sie A oder B?", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_122.wav", "doc_id": "wLqFAuDnKa.seg_122", "src_text": "Hello everyone, my name is David Vilar, and I will be giving a short review of the paper \"Prompting PaLM for Translation: Assessing Strategies and Performance.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hallo Irland, mein Name ist Aidan Villar und ich werde Ihnen eine kurze Zusammenfassung des Papiers vorstellen, das sich mit der Leistungsfähigkeit der Übersetzung befasst, die Strategien und Leistung.", "score": 31.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_598.wav", "doc_id": "oeooqChmKK.seg_598", "src_text": "We introduce a coreference resolution task, designed to probe for the ability to draw on knowledge available in different sources.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir stellen eine Korreferenz-Resolution-Aufgabe vor, die dazu konzipiert ist, die Fähigkeit zu testen, auf Kenntnisse in verschiedenen Quellen zurückzugreifen:", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_745.wav", "doc_id": "XejEJmgUmE.seg_745", "src_text": "We extract grammatical sentences from Adjunct Island and then we add it as a prefix to both the acceptable query and the unacceptable query.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "indem wir Grammatik-Sätze aus dem Adversarial-Tool extrahieren und sie als Präfix zu sowohl der akzeptablen als auch der nicht akzeptablen Frage", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_173.wav", "doc_id": "SLpqvupgvW.seg_173", "src_text": "We're not aware of a larger-scale public data set for the task, so we collect one using crowd annotation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir sind uns nicht bewusst, dass es ein öffentliches Datensatzsystem gibt, ein groß angelegtes Datensatzsystem für Aufgaben, also sammeln wir einen, indem wir die Minderheit benutzen;", "score": 43.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_510.wav", "doc_id": "dvGkKzmIaN.seg_510", "src_text": "To protect the copyright of embedding as services, one of the solutions is to embed a watermark in the provider service and detect whether another service contain the watermark.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Um die Urheberrechte an eingebetteten Diensten zu schützen, ist eine der Lösungen, ein Wasserzeichen in den Dienst des Anbieters einzubetten und zu prüfen, ob ein anderer Dienst das Wasserzeichen enthält.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_146.wav", "doc_id": "wLqFAuDnKa.seg_146", "src_text": "In particular, we compare the selecting prompts from the training data for the WMT evaluations on the dev data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "hoher Qualität auszuwählen, insbesondere die aus den Trainingsdaten der WMT-Evaluierungen oder den Testdaten.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_124.wav", "doc_id": "wLqFAuDnKa.seg_124", "src_text": "PaLM is a 540 billion-parameter large language model presented last year in 2022.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Parm ist ein fünfundvierzig Milliarden Parameter großes Sprachmodell, das im vergangenen Jahr vorgestellt wurde.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_65.wav", "doc_id": "TVCREhgqUP.seg_65", "src_text": "Obtaining trees may also involve specialized grammar-induction procedures.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "von Bäumen kann auch spezielle Grammatikinduktionsprozesse beinhalten.", "score": 88.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_153.wav", "doc_id": "wLqFAuDnKa.seg_153", "src_text": "So, in particular, the most common errors are omission errors.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Insbesondere sind die häufigsten Fehler Auslassungsfehler.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_701.wav", "doc_id": "oaOHnMCwad.seg_701", "src_text": "We host 2 tasks on lab in the wild, one of them being social acceptability, and the way this works is that participants will read a situation from the social chemistry dataset and, then they'll write how socially acceptable a situation is.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir führen zwei Tests durch, die soziale Akzeptanz zu ermitteln, und die Art und Weise, wie diese Tests durchgeführt werden, ist", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_617.wav", "doc_id": "oeooqChmKK.seg_617", "src_text": "Here's an example of how we control the availability of facts in the true sources.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hier ist ein Beispiel dafür, wie man die Verfügbarkeit von Fakten in echten Quellen steuern kann.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_391.wav", "doc_id": "WBLMIsdIrq.seg_391", "src_text": "However, evaluating how well models can translate cases like this is pretty hard.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Übersetzung ändert sich ebenfalls. Die Bewertung, wie gut Modelle solche Fälle übersetzen können,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_154.wav", "doc_id": "wLqFAuDnKa.seg_154", "src_text": "So, it seems that PaLM chooses to produce a better-sounding translation, sometimes by dropping parts of the source sentence that are made in translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Es scheint, dass Palme sich dafür entscheidet, eine bessere Übersetzung zu produzieren, indem sie manchmal Teile der verarbeiteten Satzzeilen aus der Übersetzung entfernt.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_601.wav", "doc_id": "oeooqChmKK.seg_601", "src_text": "Servin is a judge.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Servin ist ein", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_285.wav", "doc_id": "PIZEXUFLAR.seg_285", "src_text": "During training, we mix all the instances for all the tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Während des Trainings mischen wir alle Instanzen für alle Aufgaben,", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_270.wav", "doc_id": "PIZEXUFLAR.seg_270", "src_text": "However, there is no large-scale publicly-available multi-modal instruction task.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "es gibt keinen großen, öffentlich verfügbaren multimodalen Anweisungsdatensatz. Daher motivierte", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_87.wav", "doc_id": "TVCREhgqUP.seg_87", "src_text": "We address this by inducing the alignment as part of the training.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir behandeln dies, indem wir die Anpassung als Teil der Ausbildung einleiten.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_277.wav", "doc_id": "PIZEXUFLAR.seg_277", "src_text": "We follow the method from OFA and formulate all the tasks in a unified sequence-to-sequence format.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir folgen der Methode von OFA und formulieren alle Aufgaben in einem einheitlichen Sequenz-zu-Sequenz-Format,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_819.wav", "doc_id": "WTTtiRKFZI.seg_819", "src_text": "However, when the governor is on the right, as here, \"laughed\" governs the coordination Ted and Ned, this effect disappears.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "jedoch, dass, wenn die Regierung auf der rechten Seite ist, hier links regiert die Koordination, aber nicht umgekehrt.", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_262.wav", "doc_id": "oYCKgTzTDy.seg_262", "src_text": "Thanks for listening.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "präsentieren.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_68.wav", "doc_id": "TVCREhgqUP.seg_68", "src_text": "Our approach predicts the output from the input in two steps.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Unser Ansatz berechnet den Output aus dem Input in zwei Schritten. Zuerst", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_863.wav", "doc_id": "GvEBWkLmuI.seg_863", "src_text": "And these words define these groups only by their relationship to their identity and distinguish them as different from the white norm.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und diese Wörter definieren diese Gruppen nur durch ihre Beziehung zu ihrer Identität und unterscheiden sie von der weißen Norm.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_505.wav", "doc_id": "dvGkKzmIaN.seg_505", "src_text": "Currently, large language models such as GPT, LLAMA, PALM are exceptional in natural language understanding and generation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "zunächst die Hintergründe über die Einbettung von Dienstleistungen vorstellen. Aktuell sind große Sprachmodelle wie TPT, LAMA, PALM hervorragend in der", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_632.wav", "doc_id": "FLkGnzVRew.seg_632", "src_text": "Hello, my name is Vasudha and I'm a Computer Science PhD candidate at Stony Brook University.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo, mein Name ist Vasudha und ich bin Informatik-Doktorand an der Stony Brook University.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_23.wav", "doc_id": "aQpIWggfCo.seg_23", "src_text": "Then, InstructGPT over-generates K scripts for specific goals.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Anweisung „instructed GpT“ Käseskripte für bestimmte Ziele.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_814.wav", "doc_id": "WTTtiRKFZI.seg_814", "src_text": "Right?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Gouverneur also", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_640.wav", "doc_id": "FLkGnzVRew.seg_640", "src_text": "So why does this matter?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Warum ist das wichtig?", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_483.wav", "doc_id": "SUkmfOTvGi.seg_483", "src_text": "Here we also found that more fine tuning examples, actually also leads to better generalization.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hier haben wir auch festgestellt, dass mehr Feintuning-Beispiele zu einer besseren Generalisierung führen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_21.wav", "doc_id": "aQpIWggfCo.seg_21", "src_text": "Thus, we adopt the idea of over-generate-then-filter to improve generation quality.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Daher haben wir die Idee der Übererzeugung eines Filters übernommen, um die Erzeugungsgüte zu verbessern.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_770.wav", "doc_id": "XejEJmgUmE.seg_770", "src_text": "Thank you for listening.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vielen Dank, dass Sie zuhören.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_217.wav", "doc_id": "oYCKgTzTDy.seg_217", "src_text": "So, semantic parsing is a task to build semantic representations of user queries such as SQL and Lambda Calculus.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Analyse ist die Aufgabe, semantische Darstellungen von Benutzeranfragen wie z.B. SQL und Lambda-Kalkül zu erstellen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_440.wav", "doc_id": "hgIDlKNiFM.seg_440", "src_text": "Specialized models for other languages are scarce and are often based on continual pre-training due to the lack of in-domain data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Spezialisierte Modelle für andere Sprachen sind knapp und basieren häufig auf kontinuierlichem Training aufgrund des Mangels an Domänen-Daten. Bis jetzt gab es jedoch kein", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_86.wav", "doc_id": "TVCREhgqUP.seg_86", "src_text": "In addition, sometimes there are multiple permutations that are consistent with the data, but the linguistically correct one is latent.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Darüber hinaus gibt es manchmal mehrere Mutationen, die mit den Daten konsistent sind, aber die linguistisch korrekte ist latent.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_125.wav", "doc_id": "wLqFAuDnKa.seg_125", "src_text": "It's trained on a large collection of text, comprising 780 billion tokens.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Es trainiert eine große Sammlung von Texten, die über", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_763.wav", "doc_id": "XejEJmgUmE.seg_763", "src_text": "So we did a series of analysis where we tried to perturb the input sentence by, trying to preserve the relevant structure but adding like noise to the input.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wir eine Reihe von Analysen durch, bei denen wir versuchen, den Input-Satz durch das Aufrechterhalten der relevanten Struktur zu stören, aber wie Lärm zu dem Input hinzufügen", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_179.wav", "doc_id": "SLpqvupgvW.seg_179", "src_text": "In the second speech bubble, Alice says, \"Do you mean 'Easy on Me' or 'I Gotta Feeling'?\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In dem zweiten Sprachblasen sagt Alice: „Meinst du, ich bin leicht zu haben, oder habe ich ein Gefühl dafür?“", "score": 1.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_808.wav", "doc_id": "WTTtiRKFZI.seg_808", "src_text": "So what we did, we extracted various statistics about coordination from the enhanced version of the Penn Treebank and see the paper \"Why wouldn't you use universal dependencies\" and these statistics confirm the observation made many times before that left conjuncts tend to be shorter.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir haben sehr viele Statistiken über Koordination aus der erweiterten Version von Penney's Bank und dem Paper, warum man Universitätsabhängigkeiten nicht verwendet, und diese Statistiken bestätigen die Beobachtung, dass Konjunkte tendenziell kürzer werden,", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_781.wav", "doc_id": "WTTtiRKFZI.seg_781", "src_text": "So, we get some dependencies from end to all the conjuncts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "erhalten wir die Abhängigkeiten von Ende bis zu allen Kontrahenten.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_126.wav", "doc_id": "wLqFAuDnKa.seg_126", "src_text": "At the time of publication, it achieved state-of-the-art in hundreds of NLP tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "achtzig Milliarden Token umfassen. Die Veröffentlichung erreicht den Stand der Kunst in Hunderten von NRP-Aufgaben.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_534.wav", "doc_id": "dvGkKzmIaN.seg_534", "src_text": "The cosine and L2 similarity between the requested embedding and the target embedding are computed.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die Ähnlichkeit zwischen dem angeforderten Embedding und dem Ziel-Embedding wird berechnet.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_488.wav", "doc_id": "SUkmfOTvGi.seg_488", "src_text": "This means that every unit of improvement that we made, on CoNLL-2003 translates to more than one unit improvement on CoNLL++ which means that there is no diminishing returns.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dies bedeutet, dass jede Verbesserungseinheit, die wir bei Color Plus vorgenommen haben, mehr als eine Verbesserungseinheit bei Color Plus bedeutet, was bedeutet, dass es keine sinkenden Erträge gibt.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_527.wav", "doc_id": "dvGkKzmIaN.seg_527", "src_text": "The provided embedding is a weight summation of the target embedding and the original embedding.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die bereitgestellte Einbettung ist eine Zusammenfassung der Ziel-Einbettung und der Original-Einbettung. Das", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_774.wav", "doc_id": "WTTtiRKFZI.seg_774", "src_text": "So in this case, Lisa.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_159.wav", "doc_id": "SLpqvupgvW.seg_159", "src_text": "Hi!", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hi,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_267.wav", "doc_id": "PIZEXUFLAR.seg_267", "src_text": "Therefore, in this work we want to investigate whether instruction tuning a multi-modal pre-trained models can actually improve generalisation to unseen multi-modal tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Daher wollen wir in dieser Arbeit untersuchen, ob die Anweisung von mehrmodalen Protraintypen tatsächlich die Generalisierung zu mehrmodalen Aufgaben verbessern kann.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_683.wav", "doc_id": "oaOHnMCwad.seg_683", "src_text": "Design biases like the one that we just saw before might occur due to the positionality of the NLP researchers and model developers.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "wie die, die wir vorher gesehen haben, ich ermutige die Position der NLP-Forscher und Modellentwickler, die", "score": 45.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_549.wav", "doc_id": "rISrKoXQCx.seg_549", "src_text": "Political news media are well covered in their pretraining data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "trainiert, und politische Nachrichtenmedien sind gut in ihren Vorbereitungsdaten abgedeckt.", "score": 54.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_493.wav", "doc_id": "SUkmfOTvGi.seg_493", "src_text": "And these goes hand in hand, we can't just have one ingredient but throw out the others.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "benötigen, und diese Ziele gehen Hand in Hand, wir können nicht nur einen Bestandteil haben, sondern alle anderen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_202.wav", "doc_id": "SLpqvupgvW.seg_202", "src_text": "For example, the one with the piano music.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zum Beispiel der mit der", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_295.wav", "doc_id": "PIZEXUFLAR.seg_295", "src_text": "Also, transfer learning from natural instruction dataset can benefit instruction tuning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Auch das Transfer-Lernen aus natürlichen Anweisungsdatensätzen kann die Anpassung von Anweisungen begünstigen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_158.wav", "doc_id": "wLqFAuDnKa.seg_158", "src_text": "Thank you very much.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "ansehen, vielen Dank.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_545.wav", "doc_id": "dvGkKzmIaN.seg_545", "src_text": "Welcome to discuss with us.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "um mit euch zu diskutieren.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_219.wav", "doc_id": "oYCKgTzTDy.seg_219", "src_text": "As shown in this figure, we need to translate the query in multiple natural languages using neural models to SQL, Lambda or FunQL, and etcetera.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wie in dieser Abbildung dargestellt, müssen wir die Anfrage in mehreren natürlichen Sprachen mit neuronalen Modellen übersetzen: zwei, Ceco, Lamba oder Fon usw.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_791.wav", "doc_id": "WTTtiRKFZI.seg_791", "src_text": "Because here between the verb and the direct object is an adjunct: \"yesterday\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Abend zwischen dem Verb und dem direkten Objekt ein Komma gesetzt wurde.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_72.wav", "doc_id": "TVCREhgqUP.seg_72", "src_text": "We introduce a new method to predict the permutation that does not put any hard constraints on the possible permutations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir stellen eine neue Methode zur Vorhersage der Permutation vor, die keine harten Einschränkungen für die möglichen Permutationen", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_140.wav", "doc_id": "wLqFAuDnKa.seg_140", "src_text": "We saw that the actual form of the prompting doesn't have a big influence in the case of several short promptings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir sahen, dass die tatsächliche Form der Anregung keinen großen Einfluss hat, wenn es um mehrere Anregungen geht. Es", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_766.wav", "doc_id": "XejEJmgUmE.seg_766", "src_text": "That is, when we perturb the sentences in the acceptable domain, we see similar increase in all the perturbations and when we perturb the sentences in the unacceptable domain, we see decrease in MPP judgments in similar fashion.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Das heißt, wenn wir die Sätze in der akzeptablen Domäne stören, sehen wir einen ähnlichen Anstieg aller Störungen, und wenn wir die Sätze in der unakzeptablen Domäne stören, sehen wir einen ähnlichen Rückgang der MP-P-Sprüche in", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_411.wav", "doc_id": "WBLMIsdIrq.seg_411", "src_text": "And similarly, we find that context is important to translate in the right formality.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Und ähnlich finden wir, dass ein Kontext unterstützt wird, der den Transit in der richtigen Formalität ermöglicht. Und", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_617.wav", "doc_id": "oeooqChmKK.seg_617", "src_text": "Here's an example of how we control the availability of facts in the true sources.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hier ist ein Beispiel dafür, wie wir die Verfügbarkeit von Fakten in Quellen kontrollieren.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_872.wav", "doc_id": "GvEBWkLmuI.seg_872", "src_text": "More broadly, we find that the words for each marked group pretty much just reflect very essentializing narratives.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "wir feststellen, dass die Wörter für jede Markenklasse sich ziemlich einfach nur sehr grundlegende Geschichten widerspiegeln.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_228.wav", "doc_id": "oYCKgTzTDy.seg_228", "src_text": "And to better evaluate our benchmark, we consider the six settings for training and evaluation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Sprachfamilien. Und um unsere Benchmarks besser einzuschätzen, betrachten wir die sechs Einstellungen für Training und Bewertung. Zuerst", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_186.wav", "doc_id": "SLpqvupgvW.seg_186", "src_text": "Do you mean A or B?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Meinen Sie A oder B?", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_434.wav", "doc_id": "hgIDlKNiFM.seg_434", "src_text": "We introduce the first biomedical model in French named DrBERT, which is based on RoBERTa and trained on NACHOS, which is a data set of medical crawled data from the web.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir stellen den ersten biomedizinischen Modell in französisch vor, der Bert basiert und auf NACHOS trainiert wurde, einem Datensatz medizinischer Crowdsourced-Daten vom Web. Wir stellen außerdem einen Vergleich", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_50.wav", "doc_id": "TVCREhgqUP.seg_50", "src_text": "Compositional generalization can be understood as the ability of a learner to handle deeper recursion and unseen compositions of phrases that have been seen individually during training.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Kompositorische Verallgemeinerung kann als die Fähigkeit des Lernenden verstanden werden, tiefere Rekursionen und unsichtbare Kompositionen von Phrasen zu handhaben, die während des Trainings einzeln gesehen wurden.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_645.wav", "doc_id": "FLkGnzVRew.seg_645", "src_text": "To the goal of creating a cognitive dissonance resource, we conducted a large scale annotation of dissonance relations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zum Ziel der Erstellung eines kognitiven Dissensressourcen führten wir eine große Annotierung von Dissensbeziehungen durch.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_178.wav", "doc_id": "SLpqvupgvW.seg_178", "src_text": "And with that, Bob sets the dialogue context.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Und mit dem, was Bob sagt, setzt Alice den Dialogkontext.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_385.wav", "doc_id": "WBLMIsdIrq.seg_385", "src_text": "This work was done in collaboration with Patrick Fernandes, Emmy Liu, André F. T. Martins, and Graham Neubig.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Diese Arbeit wurde in Zusammenarbeit mit Patrick Frenan, MLE, Andre F. T. Martins und Graham Neubig durchgeführt.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_799.wav", "doc_id": "WTTtiRKFZI.seg_799", "src_text": "So the reasoning here is that this is possible because even though this sentence violates the general grammatical principle that direct objects should be next to the verb, it satisfies the principle of dependency length minimization, which says that shorter dependencies are preferred.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die Begründung hierfür ist, dass dies möglich ist, weil dieser Satz gegen das allgemeine grammatische Prinzip verstößt, dass ein direktes Objekt dem Subjekt folgen muss. Es erfüllt das Prinzip der Abhängigkeitslängenminimierung, was bedeutet, dass kürzere Abhängigkeiten bevorzugt werden.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_577.wav", "doc_id": "rISrKoXQCx.seg_577", "src_text": "For example, if right-leaning language models were to be fine-tuned on hate speech or misinformation or whatever and deployed to a popular social media platform, this would mean that, people with opposite political opinions might be marginalised and hate speech targeting minority groups might just run rampant without any control.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zum Beispiel, wenn es sich um rechtschreibende Sprachmodelle handelt, sollten diese in Bezug auf die Aussprache oder Informationen verfeinert und auf beliebten Social-Media-Plattformen veröffentlicht werden. Dies würde bedeuten, dass Menschen mit gegensätzlichen politischen Meinungen marginalisiert werden könnten und die Hassrede gegen Minderheitengruppen ohne jegliche Kontrolle um", "score": 35.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_550.wav", "doc_id": "rISrKoXQCx.seg_550", "src_text": "According to a survey of the C4 Corpus, we can see that New York Times, Los Angeles Times, The Guardian, Huffington Post, etcetera are well covered in language model training data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "einer Untersuchung der vier Korporationen, können die New York Times, die Los Angeles Times, The Guardian, Huffington Post usw. in Sprachmodellierungstrainingdaten enthalten. Dies hat eine Mischung aus", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_120.wav", "doc_id": "uZBWfYjYnf.seg_120", "src_text": "And we also released open source the code and models and simultaneous output to facilitate the reproducibility of our work.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und wir haben auch den Quellcode, die Modelle und die simultane Ausgabe veröffentlicht, um die Wiederverwendbarkeit unserer Arbeit zu erleichtern.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_480.wav", "doc_id": "SUkmfOTvGi.seg_480", "src_text": "The second ingredient is the model size.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "werden. Das zweite Element ist die Modellgröße:", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_347.wav", "doc_id": "gGbuDbHhyc.seg_347", "src_text": "When compared to human annotations, the weaker annotations are much cheaper, yet they are also noisy, meaning that a certain amount of the annotations are incorrect.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wenn man sie", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_456.wav", "doc_id": "hgIDlKNiFM.seg_456", "src_text": "Overall, from-scratch pre-training seems to obtain higher performance on most of the tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wenn wir mehr Daten verwenden. Insgesamt scheint das Scratch-Free-Training zu einer besseren Leistung bei den", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_345.wav", "doc_id": "gGbuDbHhyc.seg_345", "src_text": "In weak supervision, you do not manually label the data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Bei der schwachen Überwachung kennzeichnen wir die Daten nicht manuell, sondern kennzeichnen", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_269.wav", "doc_id": "PIZEXUFLAR.seg_269", "src_text": "There exist more than 1600 language-only instruction tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Es gibt mehr als tausend und sechshundert", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_40.wav", "doc_id": "aQpIWggfCo.seg_40", "src_text": "We find that T5 fine-tuned on CoScript can generate scripts of higher quality than most large language models, indicating that smaller models can surpass larger models when properly trained on suitable datasets.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Mit Hilfe von T. F. File und Ph. Fontaine können Skripte von höherer Qualität generiert werden, was bedeutet, dass kleinere Modelle größere Modelle unterstützen können, wenn sie auf geeigneten Datensätzen trainiert werden.", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_310.wav", "doc_id": "dJGfOSFgZO.seg_310", "src_text": "And today we'll tell you all about ABC-Eval, a new dimensional approach to evaluating conversational AI.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "und heute werden wir Ihnen alles über ABC Eval erzählen, einen neuen Ansatz zur Bewertung von konversationeller KI.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_284.wav", "doc_id": "PIZEXUFLAR.seg_284", "src_text": "So we use pre-trained OFA large model as a base model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Also verwenden wir ein vorgebildetes großes Modell von OpenAI als Basismodell.", "score": 62.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_520.wav", "doc_id": "dvGkKzmIaN.seg_520", "src_text": "Embedding marker contains two main steps.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Der eingebettete Marker enthält zwei Hauptschritte:", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_356.wav", "doc_id": "gGbuDbHhyc.seg_356", "src_text": "Second, if clean data is required, or if clean data is mandatory for WSL to work, then how many clean samples do we need?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "wenn saubere Daten erforderlich sind oder saubere Daten erforderlich sind, damit die WSL funktioniert, wie viele saubere Proben brauchen wir? Schließlich", "score": 68.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_409.wav", "doc_id": "WBLMIsdIrq.seg_409", "src_text": "We then look at vocabulary items that have high P-CXMI averaged over all of its different occurrences.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir schauen uns Vokabulare an, die in allen ihren verschiedenen Situationen häufig hohe p sex-mi-Ausdrücke enthalten.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_114.wav", "doc_id": "uZBWfYjYnf.seg_114", "src_text": "And we compare with popular strategies that are also applied to offline models that are the Wait-k strategy and the Local Agreement.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Und wir vergleichen sie mit geeigneten Strategien, die auch auf Online-Modelle angewendet werden, wie der Whitkey-Strategie und der lokalen Vereinbarung,", "score": 6.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_492.wav", "doc_id": "SUkmfOTvGi.seg_492", "src_text": "Our conclusion is that, for good generalization we would need a better model architecture, larger model size, as well as more fine tuning examples.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Unsere Schlussfolgerung ist, dass wir für eine gute Generalisierung eine bessere Modellarchitektur, eine größere Modellschicht und auch mehr feine Beispiele für die Anpassung benötigen.", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_30.wav", "doc_id": "aQpIWggfCo.seg_30", "src_text": "Since large language models are costly to deploy, it's essential to enable language planning ability of smaller and specialized models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Da es teuer ist, große Sprachmodelle einzusetzen, ist es wichtig, kleinere und spezialisierte Modelle für die Sprachplanung zu ermöglichen.", "score": 81.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_681.wav", "doc_id": "oaOHnMCwad.seg_681", "src_text": "Where prospective AP is really not as sensitive to offensive terms that are more common in Indian contexts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Perspektive von AP ist wirklich nicht sensibel für offensichtliche Begriffe oder häufiger in indischen Kontexten.", "score": 49.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_576.wav", "doc_id": "rISrKoXQCx.seg_576", "src_text": "There are a bunch of more examples in the appendix to further highlight that this indicates that there is a fairness issue that is very pressing regarding the political biases of language models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "in den sozialen Kategorien verwendet, gibt es eine Vielzahl von weiteren Beispielen in den Anmerkungen, um das zu unterstreichen. Dies deutet darauf hin, dass es sich um eine Gerechtigkeitsfrage handelt, die in Bezug auf die politischen Vorurteile von Sprachmodellen sehr", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_104.wav", "doc_id": "uZBWfYjYnf.seg_104", "src_text": "That is the cross-attention mechanism, and you can see an example on the right.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "sind in der Tat sehr gut, und Sie können ein Beispiel dafür sehen.", "score": 29.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_474.wav", "doc_id": "SUkmfOTvGi.seg_474", "src_text": "We evaluated them on both the CoNLL-03 test sets and the CoNLL++.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "2003 fein justiert und sie sowohl auf dem Konrad 3-Testset als auch auf dem Konrad + Testset", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_165.wav", "doc_id": "SLpqvupgvW.seg_165", "src_text": "Here, a user wants to select between one of these two songs.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hier möchte der Benutzer zwischen diesen beiden Liedern wählen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_629.wav", "doc_id": "oeooqChmKK.seg_629", "src_text": "Still, even the best-performing models seem to have difficulties with reliably integrating backward knowledge presented only at inference time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Nichtsdestotrotz scheinen selbst die besten Modelle Schwierigkeiten zu haben, zuverlässig integriertes Backward-Knowledge zu präsentieren.", "score": 59.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_24.wav", "doc_id": "aQpIWggfCo.seg_24", "src_text": "Next, a filter model is developed to select the faithful scripts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Überschriften. Als nächstes wird ein Filtermodell entwickelt, um die physikalischen Skripte auszuwählen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_792.wav", "doc_id": "WTTtiRKFZI.seg_792", "src_text": "However, this effect may be ameliorated when the direct object is very heavy and very long.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Dieser Effekt kann jedoch verbessert werden, wenn das Direktoberfläche sehr schwer und sehr lang ist,", "score": 68.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_719.wav", "doc_id": "oaOHnMCwad.seg_719", "src_text": "First one is keep a record of all relevant design choices throughout the research process.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "für dies: Zuerst ist es wichtig, einen Verlauf aller relevanten Designentscheidungen während des Forschungsprozesses zu führen", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_334.wav", "doc_id": "dJGfOSFgZO.seg_334", "src_text": "For example, the bots we tested have common sense violations in around 20% of their responses.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Zum Beispiel haben die Bots, die wir getestet haben, noch immer Verstöße gegen den Common Sense in etwa 20 Prozent ihrer Antworten.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_700.wav", "doc_id": "oaOHnMCwad.seg_700", "src_text": "Compared to the platforms like M Turk which largely have participants from the US or India and further Lab in the Wild still is able to get high quality data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "im Gegensatz zu Plattformen wie EmTurk, die überwiegend Teilnehmer aus den USA und Indien haben, und weiterhin können wir auf der Plattform Life on the World hochwertige Daten erhalten.", "score": 51.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_289.wav", "doc_id": "PIZEXUFLAR.seg_289", "src_text": "If the task is a multi-model classification task, we report accuracy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wenn die Aufgabe eine multimodale Klassifizierungsaufgabe ist, berichten wir über Genauigkeit, wenn", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_704.wav", "doc_id": "oaOHnMCwad.seg_704", "src_text": "We then replicate a very similar setup for the toxicity and hate speech detection task, where they'll read an instance from Dynahate and write whether they think it's instance of hate speech.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Dann wurden wir für die Aufgabe der Erkennung von toxischen und heiklen Sprachphänomenen sehr ähnlich aufgestellt, wobei sie einen Fall aus der Dänischen Sprache und Recht, ob sie denken, dass es ein Sprachphänomen ist, erkannten. Wir verglichen dann diese", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_296.wav", "doc_id": "PIZEXUFLAR.seg_296", "src_text": "Here we can see, as the amount of task increases, the model achieves better performance and in the meantime, lower sensitivity.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hier können wir sehen, dass, wenn die Anzahl der Aufgaben zunimmt, das Modell eine bessere Leistung erzielt und in der Zwischenzeit", "score": 61.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_504.wav", "doc_id": "dvGkKzmIaN.seg_504", "src_text": "Let's first introduce the background about embedding as services.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "dem Backdoor-Wasserzeichen. Lassen Sie uns", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_646.wav", "doc_id": "FLkGnzVRew.seg_646", "src_text": "We used dissonance-first approach, as seen in the flow chart here.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir verwendeten den Dissens-First-Ansatz, wie in diesem Flussdiagramm zu", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_276.wav", "doc_id": "PIZEXUFLAR.seg_276", "src_text": "Here we show some example instances from our MultiInstruct dataset, to unify the processing of various input and output data types.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hier zeigen wir einige Beispielinstanzen aus unserem Multi-Insta-Datensatz. Um die Verarbeitung verschiedener Eingabe- und Ausgabedatenarten zu vereinheitlichen.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_443.wav", "doc_id": "hgIDlKNiFM.seg_443", "src_text": "To answer this question, we compare DrBERT with our ChuBERT model, which is based on anonymized data obtained from the Nantes University Hospital data warehouse.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Zur Beantwortung dieser Frage vergleichen wir Dr. Bert mit unserem Shubert-Modell, das auf anonymisierten Daten aus dem Universitätskrankenhaus basiert.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_748.wav", "doc_id": "XejEJmgUmE.seg_748", "src_text": "So that is what we call as the mismatch scenario.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "also das nennen wir das", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_823.wav", "doc_id": "WTTtiRKFZI.seg_823", "src_text": "But when the governor is on the right this tendency disappears.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "wenn der Konjunktiv auf der rechten Seite steht.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_44.wav", "doc_id": "aQpIWggfCo.seg_44", "src_text": "We hope the CoScript dataset can be a valuable resource to advance research on language planning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir hoffen, dass das CoScript-Datensatz ein wertvolles Ressourcen zur Fortschritt der Forschung auf Sprachplanung sein kann.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_859.wav", "doc_id": "GvEBWkLmuI.seg_859", "src_text": "And in fact, this lexicon doesn't really capture many of the harmful patterns that we saw in the earlier slides well at all.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Und tatsächlich erfasst dieser Lexikon nicht wirklich viele der schädlichen Muster, die wir in den früheren Seiten gesehen haben,", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_616.wav", "doc_id": "oeooqChmKK.seg_616", "src_text": "For example, because new occupations have developed since the time of pretraining.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zum Beispiel, weil neue Berufe seit der Zeit der Vorbereitung entwickelt wurden.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_238.wav", "doc_id": "oYCKgTzTDy.seg_238", "src_text": "And we also consider Cross-lingual Zero-shot and Few-shot transfer.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir betrachten auch die Übertragung von Zero-Shot und Fein-Schot", "score": 45.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_714.wav", "doc_id": "oaOHnMCwad.seg_714", "src_text": "However, when models and data sets are aligned to specific populations, some are inevitably left behind.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "mit Personen mit einem College-Abschluss übereinstimmt.", "score": 5.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_817.wav", "doc_id": "WTTtiRKFZI.seg_817", "src_text": "Here we have coordination of two verbs and there's no outsides, external governor.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "die Koordinierung zweier Verben und es gibt keinen Außen-Außen-Gouverneur, der", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_127.wav", "doc_id": "wLqFAuDnKa.seg_127", "src_text": "In this work, we present the first systematic study of large language model prompting for machine translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "In diesem Werk präsentieren wir die erste systematische Studie des großsprachigen Modells für die maschinelle Übersetzung.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_110.wav", "doc_id": "uZBWfYjYnf.seg_110", "src_text": "This means that these three words will be emitted.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Das bedeutet, dass diese drei Wörter ausgelassen werden.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_365.wav", "doc_id": "gGbuDbHhyc.seg_365", "src_text": "But that's not the end of the story, because if we either way decide to access clean samples, then training on them directly will even achieve better performance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Aber das ist nicht das Ende der Geschichte, denn wenn wir uns entscheiden, Proben direkt zu trainieren, werden wir sogar bessere Ergebnisse erzielen.", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_436.wav", "doc_id": "hgIDlKNiFM.seg_436", "src_text": "Then, we present our results on 11 biomedical and clinical downstream tasks in French.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Dann präsentieren wir unsere Ergebnisse in elf biomedizinischen und klinischen Downstream-Aufgaben. Und schließlich kommen", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_654.wav", "doc_id": "FLkGnzVRew.seg_654", "src_text": "We transfer from two different tasks: topic independent dissonance stance classification, a task that determines if two debate statements from different people are in agreement or in disagreement, irrespective of topic, called debate here, and on binary classification of expansion and comparison classes of PDTB since these two are closely related to the conception of consonance and dissonance and we call them CE here.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir übertragen aus zwei verschiedenen Themen, die unabhängige Klassifizierung der Themen, die zwei Abstimmungen von verschiedenen Personen sind, die in Übereinstimmung oder in Übereinstimmung mit dem Thema sind. „Debatten hier“ und „in der binären Klassifizierung von Erweiterung und Vergleich von Klassen von Punkt- und Dissonanzen“ sind eng mit der Konzeption von Konsonanzen und Dissonanzen verbunden, und wir nennen sie hier „CE“.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_116.wav", "doc_id": "uZBWfYjYnf.seg_116", "src_text": "These are all the results of the simultaneous speech translation strategy on German.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Dies sind alle Ergebnisse der simultanen Sprachübersetzung-Strategie auf Deutsch.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_820.wav", "doc_id": "WTTtiRKFZI.seg_820", "src_text": "So we showed that by measuring length in characters, the first column, in syllables the middle column, and in words the right column.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir zeigen dies, indem wir die Länge in den Buchstaben der ersten Spalte in Silben, der mittleren Spalte und in den Wörtern der rechten Spalte messen, um", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_794.wav", "doc_id": "WTTtiRKFZI.seg_794", "src_text": "This is illustrated here.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "die Position bewegt werden kann. Das ist hier dargestellt,", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_578.wav", "doc_id": "rISrKoXQCx.seg_578", "src_text": "So this has sound the alarm for us to acknowledge and tackle the fairness issues resulting by language model political leanings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "richtet, ohne jede Kontrolle auslaufen könnte. Daher klingt das wie eine Warnung, dass wir uns bewusst machen und die Probleme, die durch", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_335.wav", "doc_id": "dJGfOSFgZO.seg_335", "src_text": "They produce irrelevant information in around 15% of the responses, and they contradict themselves or their partner around 10% of the time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Sie produzieren irrelevanten Informationsgehalt in etwa fünfzehn Prozent ihrer Antworten, und sie widersprechen sich selbst oder ihrer Partnerin in etwa zehn Prozent der Zeit.", "score": 88.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_666.wav", "doc_id": "FLkGnzVRew.seg_666", "src_text": "We also check the feasibility of each strategy for annotation quality and costs to annotators.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir überprüfen außerdem die Machbarkeit jeder Strategie für die Qualität und Kosten der Annotationen für die Annotatoren.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_23.wav", "doc_id": "aQpIWggfCo.seg_23", "src_text": "Then, InstructGPT over-generates K scripts for specific goals.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "die übergeordnete GPT-Kasusgruppen für spezifische Ziele.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_301.wav", "doc_id": "PIZEXUFLAR.seg_301", "src_text": "As we can see by transfer learning from natural instruction datasets, the model can achieve much better sensitivity compared to the original OFA model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wie wir sehen können, kann das Modell durch Transfer-Learning aus natürlichen Anweisungs-Datensätzen eine viel bessere Sensitivität erreichen, verglichen mit dem ursprünglichen OFA-Modell. 1", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_870.wav", "doc_id": "GvEBWkLmuI.seg_870", "src_text": "And while it sounds positive at first glance, there's been work showing that this kind of archetype actually is very harmful because it puts a lot of pressure on these demographics to be resilient and strong against societal obstacles.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und obwohl es klingt, als wäre es positiv, wenn man es zum ersten Mal sieht, gibt es Arbeit, die zeigt, dass dieser Art von Archetyp tatsächlich sehr schädlich ist, weil er viel Druck auf diese Demografien ausübt, um resistent und stark gegen soziale Hindernisse zu sein,", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_824.wav", "doc_id": "WTTtiRKFZI.seg_824", "src_text": "And we show in the paper how this provides an argument against asymmetric structures of coordination, as these two, and for the symmetric structures, as these two.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Und wir zeigen in dem Papier, wie dies eine Argumentation gegen asymmetrische Strukturen der Koordinierung wie diese beiden und für asymmetrische Strukturen wie diese beiden bietet.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_299.wav", "doc_id": "PIZEXUFLAR.seg_299", "src_text": "As we can see, using more instructions can improve the model's overall performance and reduce its sensitivity a lot.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "sehen, dass die Verwendung mehrerer Anweisungen die Gesamtleistung des Modells verbessert und seine Empfindlichkeit verringert.", "score": 82.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_678.wav", "doc_id": "oaOHnMCwad.seg_678", "src_text": "You might turn towards a popular API like Prospective API for toxicity detection, and this works really well if you're Carl Jones.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Sie könnten sich an eine beliebte API wie eine Perspektiv-API für die Detektion von Toxizität wenden, und das funktioniert wirklich gut, wenn Sie Carl Jones sind,", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_144.wav", "doc_id": "wLqFAuDnKa.seg_144", "src_text": "The summary of our experimental results is that the example quality is more important than the similarity to the source sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Teil des Gewichts tragen. 2. Die Zusammenfassung unserer experimentellen Ergebnisse lautet, dass die Qualität der Proben wichtiger ist als die Ähnlichkeit zur Quellsatz. Es", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_77.wav", "doc_id": "TVCREhgqUP.seg_77", "src_text": "Then we jump to the next multiset token, to determine the second token in the output.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Dann springen wir zum nächsten Multisets-Token, um den zweiten Token im Ausgang zu bestimmen.", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_629.wav", "doc_id": "oeooqChmKK.seg_629", "src_text": "Still, even the best-performing models seem to have difficulties with reliably integrating backward knowledge presented only at inference time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "scheinen selbst die besten Modelle Schwierigkeiten mit zuverlässig integriertem Rückwärtswissen zu haben, das nur zu bestimmten Zeiten präsentiert wird.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_284.wav", "doc_id": "PIZEXUFLAR.seg_284", "src_text": "So we use pre-trained OFA large model as a base model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Daher verwenden wir das vorbereitete OFA-Modell als Basismodell,", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_820.wav", "doc_id": "WTTtiRKFZI.seg_820", "src_text": "So we showed that by measuring length in characters, the first column, in syllables the middle column, and in words the right column.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Länge in Zeichen, das erste Kolumne in Silben, das mittlere Kolumne und das rechte Kolumne in Wörtern, können wir zeigen, dass", "score": 41.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_848.wav", "doc_id": "GvEBWkLmuI.seg_848", "src_text": "So the Marked Words method draws upon the sociolinguistic concept of \"markedness\", which states that there is an unmarked default, and any group that differs from that default is linguistically marked.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die Markwörter-Methode stützt sich auf das soziolinguistische Konzept der Markierung, das besagt, dass es sich um eine unmarkierte Form handelt und jede Gruppe, die von dieser Form abweicht, ist soziolinguistisch markiert.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_778.wav", "doc_id": "WTTtiRKFZI.seg_778", "src_text": "They single out one of the conjuncts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "sind, sie sich also aus den Konjugaten", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_239.wav", "doc_id": "oYCKgTzTDy.seg_239", "src_text": "We train on one source language and transfer to another language.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "zwischen einer Quellsprache und der Transferierung in eine andere Sprache.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_428.wav", "doc_id": "WBLMIsdIrq.seg_428", "src_text": "To summarize, we perform a data-driven analysis across 14 language pairs to identify when translations require context and then we use our findings to build a benchmark for document-level machine translation which can help us identify which discourse phenomena models can handle well or not, and which translation systems are good at document-level translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zusammenfassend führen wir eine Daten-Driven-Analyse über vierzehn Sprachpaare durch, um zu identifizieren, wann eine Übersetzung erforderlich ist. Dann verwenden wir unsere Ergebnisse, um einen Benchmark für die Dokumentenübertragung auf Maschinenebene zu erstellen, der uns helfen kann, zu erkennen, welche Diskursphänomene von Modellen gut oder schlecht übertragen werden können, und welche Übertragungssysteme wir für die Dokumentenübertragung gut verwenden. Vielen Dank,", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_229.wav", "doc_id": "oYCKgTzTDy.seg_229", "src_text": "The first one is Translate-Test.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "verwenden wir Google Translate API,", "score": 7.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_673.wav", "doc_id": "FLkGnzVRew.seg_673", "src_text": "Thank you.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "wenn Sie Fragen haben.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_484.wav", "doc_id": "SUkmfOTvGi.seg_484", "src_text": "To our next question, what causes the performance drop of some models, We had two hypothesis.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zur nächsten Frage: Was verursacht den Leistungsabfall einiger Modelle? Wir haben zwei Hypothesen,", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_804.wav", "doc_id": "WTTtiRKFZI.seg_804", "src_text": "That's why this sounds quite okay.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "deshalb klingt es recht gut,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_199.wav", "doc_id": "SLpqvupgvW.seg_199", "src_text": "For the recipes and books domain, we show some background text from Wikipedia.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Rezept- und Buchdomäne zeigen wir einige Hintergrundtexte aus Wikipedia.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_198.wav", "doc_id": "SLpqvupgvW.seg_198", "src_text": "Here's for example, the Google search result for the song \"Easy on Me.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hier ist zum Beispiel das Google-Suchergebnis für das Lied Easy. Für die", "score": 76.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_761.wav", "doc_id": "XejEJmgUmE.seg_761", "src_text": "Now this and this is very large like this effect, increases throughout the context length and this would probably affect like newer language models which has large context window.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Nun, das ist sehr groß, wie sich dieser Effekt über die Kontextlänge ausdehnt, und würde wahrscheinlich neuere Sprachmodelle mit großen Kontextfenstern beeinflussen.", "score": 87.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_849.wav", "doc_id": "GvEBWkLmuI.seg_849", "src_text": "So for instance, the word \"warrior\" is usually associated with men.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wort Mann verbunden, und wenn jemand", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_658.wav", "doc_id": "FLkGnzVRew.seg_658", "src_text": "Next, we determine the best method to update a model with new data from each round of active learning and annotations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "starten. Nächstens bestimmen wir die beste Methode, um ein Modell mit neuen Daten aus jeder Runde des aktiven Lernens und der Annotationen zu aktualisieren:", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_492.wav", "doc_id": "SUkmfOTvGi.seg_492", "src_text": "Our conclusion is that, for good generalization we would need a better model architecture, larger model size, as well as more fine tuning examples.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Unsere Schlussfolgerung ist, dass wir für eine gute Verallgemeinerung eine bessere Modellarchitektur, eine größere Modellgröße sowie mehr fein abgestimmte", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_391.wav", "doc_id": "WBLMIsdIrq.seg_391", "src_text": "However, evaluating how well models can translate cases like this is pretty hard.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die Bewertung, wie gut Modelle solche Fälle übersetzen können,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_644.wav", "doc_id": "FLkGnzVRew.seg_644", "src_text": "Finally, cognitive dissonance is important to understand personal cognitive styles of individuals and helps us understand decision making processes better.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Schließlich ist kognitive Diskrepanz wichtig, um persönliche kognitive Stile von Individuen zu verstehen und uns dabei zu helfen, Entscheidungsprozesse besser zu verstehen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_18.wav", "doc_id": "aQpIWggfCo.seg_18", "src_text": "We dig into a more fine-grained topic categories of constraints defined in wikiHow.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir gehen in mehr spezialisierte Themenkategorien von Einschränkungen ein, die je nach Arbeitsumgebung variieren;", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_399.wav", "doc_id": "WBLMIsdIrq.seg_399", "src_text": "And this is done by measuring how much information the context C provides about the target Y, given the source X. You can think of CXMI as the information gained from giving context to the model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "indem die Menge der Informationen, die der Kontext C über das Ziel Y bietet, gemessen wird, gegeben diese Quelle X. Sie können CXMI als die Informationen denken, die aus dem Hintergrund des Modells gewonnen werden.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_496.wav", "doc_id": "SUkmfOTvGi.seg_496", "src_text": "And we found that the answer is actually a resounding yes.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "tatsächlich lautete die Antwort ein eindeutiges Ja.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_547.wav", "doc_id": "rISrKoXQCx.seg_547", "src_text": "Today I'm presenting our work \"From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "ich unsere Arbeit von der Entwicklung von Sprachmodellen bis hin zu Downstream-Aufgaben, die die Spuren politischer Parteien verfolgen. Die Sprachmodelle werden anhand von Web-Crawl-Daten", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_482.wav", "doc_id": "SUkmfOTvGi.seg_482", "src_text": "And last but not least, we all know that the number of fine tuning examples directly affects the performance of a downstream task.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Und zuletzt, aber nicht zuletzt, wissen wir alle, dass die Anzahl der feinjustierten Beispiele direkt auf die Leistung einer Abwärtsaufgabe wirkt:", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_715.wav", "doc_id": "oaOHnMCwad.seg_715", "src_text": "An example of this is that datasets and models are less aligned to non binary people compared to the men and women counterparts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Ein Beispiel hierfür ist, dass die Daten die Modelle weniger zu nicht binären Personen im Vergleich zu den männlichen und weiblichen Gegenparten verbinden:", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_814.wav", "doc_id": "WTTtiRKFZI.seg_814", "src_text": "Right?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Gouverneur in", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_21.wav", "doc_id": "aQpIWggfCo.seg_21", "src_text": "Thus, we adopt the idea of over-generate-then-filter to improve generation quality.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Daher übernehmen wir die Idee des übergenerierten Filters zur Verbesserung der Ausgangsqualität. Zunächst zeigen wir eingeschränkte", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_368.wav", "doc_id": "gGbuDbHhyc.seg_368", "src_text": "Finally, the performance improvement claimed in previous WSL approaches can be easily achieved by allowing to continue fine-tuning on the clean validation samples.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Schließlich kann die behauptete Leistungsverbesserung bei früheren WSL-Ansätzen leicht erreicht werden, indem eine weitere Feinabstimmung der reinen Validierungsmuster erlaubt wird.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_298.wav", "doc_id": "PIZEXUFLAR.seg_298", "src_text": "We use one instruction versus 5 instruction.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "wir haben eine Anweisung gegenüber fünf Anweisungen verwendet, wie", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_318.wav", "doc_id": "dJGfOSFgZO.seg_318", "src_text": "Our approach attempts to reduce the subjectivity of human evaluation by explicitly annotating whether or not each model response expresses certain behaviors, such as responding with irrelevant information or contradicting itself.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Unsere Herangehensweise versucht, die Subjektivität der menschlichen Bewertung, indem sie explizit annotiert, ob oder nicht jede Modellantwort bestimmte Verhaltensweisen ausdrückt, wie zum Beispiel das Antworten mit irrelevanten Informationen oder sich selbst zu widersprechen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_705.wav", "doc_id": "oaOHnMCwad.seg_705", "src_text": "We then compared these annotations with Dynahate, Perspective API, Rewire API, Hate Roberta and GPT 4.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "mit Dynaheat, Perspektiv-AP, Rewire-AP, Heat-Roberta und GPT-Four. Insgesamt gibt", "score": 57.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_786.wav", "doc_id": "WTTtiRKFZI.seg_786", "src_text": "OK.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Okay,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_403.wav", "doc_id": "WBLMIsdIrq.seg_403", "src_text": "And we perform our analysis on transcripts of TED talks that have been translated from English to 14 different languages.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Und wir führen unsere Analyse auf Transkripte von TED-Talks durch, die von Englisch in 14 verschiedene Sprachen übersetzt wurden.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_95.wav", "doc_id": "uZBWfYjYnf.seg_95", "src_text": "And what are the problems of the current SimulST models?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und was sind die Probleme der aktuellen Simulationsmodelle?", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_508.wav", "doc_id": "dvGkKzmIaN.seg_508", "src_text": "However, recent works have shown that the attacker may steal the model through learning from the embedding and provide similar services.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Arbeiten gezeigt, dass der Angreifer das Modell durch das Lernen aus der Einbettung und das Bereitstellen ähnlicher Dienste stehlen kann.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_842.wav", "doc_id": "GvEBWkLmuI.seg_842", "src_text": "To capture these patterns, our method has two parts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "diesen Partnern gehört unsere Methode, die aus zwei Teilen", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_327.wav", "doc_id": "dJGfOSFgZO.seg_327", "src_text": "In addition, ABC-Eval labels are more predictive of the overall conversation quality compared to metrics produced by existing methods, as shown by this simple linear regression analysis.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "wie durch die einfache Regressionsanalyse gezeigt. Zum Beispiel können Sie sehen, wie sich die Proportion von Wendungen", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_162.wav", "doc_id": "SLpqvupgvW.seg_162", "src_text": "Our goal is to understand users’ language when they want to make a choice.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Unser Ziel ist es, die Sprache der Benutzer zu verstehen, wenn sie eine Wahl treffen", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_315.wav", "doc_id": "dJGfOSFgZO.seg_315", "src_text": "Therefore, you might want to evaluate multiple dimensions of chat quality to understand the strengths and weaknesses of the model on a finer-grained level.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "daher möchten Sie möglicherweise die verschiedenen Dimensionen der Chatkompetenz bewerten, um die Vor- und Nachteile des Modells auf einer feineren Granularität zu verstehen.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_236.wav", "doc_id": "oYCKgTzTDy.seg_236", "src_text": "For example, we put the German, English, Chinese queries together to train a multilingual model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zum Beispiel stellen wir die deutschen, englischen und chinesischen Fragen zusammen, um ein mehrsprachiges Modell zu trainieren,", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_249.wav", "doc_id": "oYCKgTzTDy.seg_249", "src_text": "We also compare the cross-language performance gap.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir vergleichen außerdem die Leistungsdifferenz bei der Übersetzung zwischen Sprachen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_666.wav", "doc_id": "FLkGnzVRew.seg_666", "src_text": "We also check the feasibility of each strategy for annotation quality and costs to annotators.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir prüfen auch die Anwendbarkeit jeder Strategie für Anmerkungen, Qualität und Kosten für Anmerkungen.", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_336.wav", "doc_id": "dJGfOSFgZO.seg_336", "src_text": "With the rapid pace of improvement in the field, many of these error rates could see a decrease in new models released since our evaluation was conducted.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "der Zeit. Viele dieser Fehlerkoeffizienten könnten zu einer Abnahme in neuen Modellen führen, die bei der Bewertung durchgeführt wurden.", "score": 51.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_676.wav", "doc_id": "oaOHnMCwad.seg_676", "src_text": "This work was done in collaboration with some folks at the University of Washington and the Allen Institute for AI, namely Sebastian Santy, Ronan Le Bras, Katharina Reinecke and Maarten Sap.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Diese Arbeit wurde in Zusammenarbeit mit einigen der Universität von Washington und dem All Institute for A. Sebastian Santi, Ronin Labas, Caterina R. und Martin S.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_670.wav", "doc_id": "FLkGnzVRew.seg_670", "src_text": "We also find that iterative update is useful for transfer learning from a different domain, whereas in domain active annotations benefit from cumulative update.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir finden außerdem, dass die iterierte Aktualisierung nützlich ist, um aus einer anderen Domäne zu transferieren, wobei die aktiven Anmerkungen von kumulativen Updates profitieren.", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_394.wav", "doc_id": "WBLMIsdIrq.seg_394", "src_text": "In this work, we try to answer these two questions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In dieser Arbeit versuchen wir, diese beiden Fragen zu beantworten:", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_826.wav", "doc_id": "WTTtiRKFZI.seg_826", "src_text": "And talk to us about at the poster session.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vereinbarung und die Argumente an, und sprechen Sie uns über", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_864.wav", "doc_id": "GvEBWkLmuI.seg_864", "src_text": "This contributes to a long legacy of discrimination and othering for these groups.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Dies trägt zu einer langen Vorgeschichte der Diskriminierung und anderen für diese Gruppen bei.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_271.wav", "doc_id": "PIZEXUFLAR.seg_271", "src_text": "Therefore, this motivates us to build a multi-modal instruction tuning dataset.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "uns das, ein multimodales Anweisungssystem zu erstellen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_692.wav", "doc_id": "oaOHnMCwad.seg_692", "src_text": "We do this through our framework NLPositionality.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir tun dies durch unser Framework NLP Positionality.", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_843.wav", "doc_id": "GvEBWkLmuI.seg_843", "src_text": "The first one is generating these personas.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Der erste Teil erzeugt diese Personen.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_442.wav", "doc_id": "hgIDlKNiFM.seg_442", "src_text": "So we ask ourselves a question about what is the most appropriate data sources for a wide range of usage and those crawled data are good substitution for clinical data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "was die geeignetsten Datenspeicher für eine breite Palette von Anwendungen sind, und diese Crondaten sind gute Ersatzdaten für klinische Daten.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_618.wav", "doc_id": "oeooqChmKK.seg_618", "src_text": "In the Background-Pretrain setting, we assume that the background knowledge \"Politicians seek elected seats in government\" is contained in the pretrained parameters and in inference-time context we provide the entity-specific knowledge \"Chichester is a politician.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Im Hintergrund, im Vorfeld der Ausbildung, nehmen wir an, dass die Hintergrundwissen, die Politiker suchen, um einen Sitz in der Regierung zu erlangen, in den Vorwahlnachrichten enthalten sind. Im Kontext der Nachwirkungen stellen wir die antipolitische Kenntnis dar, dass Chichester ein Politiker ist.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_125.wav", "doc_id": "wLqFAuDnKa.seg_125", "src_text": "It's trained on a large collection of text, comprising 780 billion tokens.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Es wird auf einer großen Sammlung von Texten trainiert, die", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_118.wav", "doc_id": "uZBWfYjYnf.seg_118", "src_text": "And we also see that if we consider the actual elapsed time or the computational-aware time, that is the fastest strategy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Und wir sehen auch, dass, wenn wir die tatsächliche Abstimmungszeit oder die computergestützte Arbeitszeit betrachten, Adat die schnellste Strategie ist.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_547.wav", "doc_id": "rISrKoXQCx.seg_547", "src_text": "Today I'm presenting our work \"From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Heute präsentiere ich unsere Arbeit von der Vorbereitung von Daten bis hin zu Sprachmodellen und dann zu Downstream-Aufgaben, bei denen wir die Spuren politischer Voreingenommenheit verfolgen,", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_546.wav", "doc_id": "rISrKoXQCx.seg_546", "src_text": "Hi, I'm Shangbin, PhD student in the University of Washington.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Heute präsentiere", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_370.wav", "doc_id": "gGbuDbHhyc.seg_370", "src_text": "However, if we allow to continue fine-tuning on the clean samples, then FTw performs equally well as other methods.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wenn wir jedoch die Möglichkeit haben, die Fine-Tuning der gelöteten Proben fortzusetzen, dann funktioniert F. T. W. genauso gut wie andere Methoden.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_744.wav", "doc_id": "XejEJmgUmE.seg_744", "src_text": "And what we do is that to recreate like longer sequences and which are acceptable and which has the same matching of the grammatical structure.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Und was wir tun, ist, dass wir, um längere Sequenzen zu erstellen, die akzeptabel sind und die gleiche Übereinstimmung der grammatikalischen", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_589.wav", "doc_id": "oeooqChmKK.seg_589", "src_text": "Hello everyone, I'm Akshatha, and today my co-author Martin and I are presenting our work \"The KITMUS Test: Evaluating Knowledge Integration from Multiple Sources.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hallo alle, ich bin Marten und heute präsentiere ich meine Arbeit „Knowledge Integration from Multiple Sources“, die", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_172.wav", "doc_id": "SLpqvupgvW.seg_172", "src_text": "This is an important problem in conversational systems and also for benchmarking LLMs' entity understanding.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Dies ist ein wichtiges Problem in Konversationsystemen und auch für das Benchmarken des Verständnisses von LLMs-Entitäten.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_800.wav", "doc_id": "WTTtiRKFZI.seg_800", "src_text": "So these two trees only show the length of the crucial dependencies, the ones that are not constant among these two structures.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "So zeigen diese beiden Bäume nur die Länge der entscheidenden Abhängigkeiten, die nicht zwischen diesen beiden Strukturen konstant sind.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_517.wav", "doc_id": "dvGkKzmIaN.seg_517", "src_text": "However, this method either not applicable to embedding as services or lack of transferability.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Diese Methoden sind jedoch entweder nicht für das Einbetten von Anzeigediensten anwendbar oder es fehlt an Übertragbarkeit.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_754.wav", "doc_id": "XejEJmgUmE.seg_754", "src_text": "So first, we look at the Wikipedia sentences, which are completely irrelevant to the current query pair, and there we find that the MPP judgments are mostly robust for arbitrary context length.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zunächst betrachten wir die Wikipedia-Sätze, die völlig irrelevant zu der aktuellen Fragepaar sind, und dort finden wir heraus, dass die MP-PP-Urteile für willkürliche Kontextlängen überwiegend robust sind.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_38.wav", "doc_id": "aQpIWggfCo.seg_38", "src_text": "We find CoScript shows high pluralism in the generated specific goals.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir finden mit Coscript einen Hypothesentest", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_807.wav", "doc_id": "WTTtiRKFZI.seg_807", "src_text": "Ok.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "herausgefunden?", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_473.wav", "doc_id": "SUkmfOTvGi.seg_473", "src_text": "We then fine-tuned over 20 models on CoNLL-2003.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir haben dann über zwanzig Modelle auf dem Cornu", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_201.wav", "doc_id": "SLpqvupgvW.seg_201", "src_text": "Then, we asked the annotators to pick one of these entities, for example, here's the first one, and describe them using three to five indirect referring expressions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Dann bitten wir die Annotatoren, eine dieser Entitäten auszuwählen, z. B. die erste, und sie mit drei bis fünf indirekten Referenzausdrücken zu beschreiben,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_556.wav", "doc_id": "rISrKoXQCx.seg_556", "src_text": "So specifically, we first proposed to prompt language models with different prompt formats using the political questionnaires such as the political conference test.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "schlagen wir speziell vor, Sprachmodelle mit unterschiedlichen Prompt-Formaten zu entwickeln, wobei die politischen Fragen wie der politische Kompass verwendet werden, was", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_160.wav", "doc_id": "SLpqvupgvW.seg_160", "src_text": "I'm going to talk about our work on \"Resolving Indirect Referring Expressions for Entity Selection\", in which we introduce the AltEntities Corpus.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "ich möchte über unsere Arbeit sprechen, bei der wir indirekte Referenzausdrücke für die Entitätenauswahl lösen, wobei wir den All-Entities-Corpus einführen.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_763.wav", "doc_id": "XejEJmgUmE.seg_763", "src_text": "So we did a series of analysis where we tried to perturb the input sentence by, trying to preserve the relevant structure but adding like noise to the input.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir haben eine Reihe von Analysen durchgeführt, bei denen wir versucht haben, die Eingabe-Satz durch das Hinzufügen von Rauschen zu der Eingabe zu stören, während wir versucht haben, die relevante Struktur zu erhalten, und", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_278.wav", "doc_id": "PIZEXUFLAR.seg_278", "src_text": "In which the input text, images, instructions and bounding boxes are represented in the same token space.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "in dem der Eingabetext, die Bilder, die Anweisungen und die Grenzkästchen im gleichen Tokenraum dargestellt werden.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_623.wav", "doc_id": "oeooqChmKK.seg_623", "src_text": "Without task-specific training on KITMUS, both models do not perform well.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "einer taskspezifischen Ausbildung auf Kidmus werden beide Modelle nicht gut ausgebildet.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_767.wav", "doc_id": "XejEJmgUmE.seg_767", "src_text": "So, the key takeaways of our work is that language models are sensitive to latent syntactic and semantic features which are shared across the sentences.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Schlüssel zu unserer Arbeit ist, dass Sprachmodelle auf latente syntaktische und semantische Merkmale reagieren, die in allen Sätzen gemeinsam sind.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_781.wav", "doc_id": "WTTtiRKFZI.seg_781", "src_text": "So, we get some dependencies from end to all the conjuncts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir erhalten auch Abhängigkeiten von Ende zu Ende. Und", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_464.wav", "doc_id": "SUkmfOTvGi.seg_464", "src_text": "Today I'm going to present our paper Do CoNLL-2003 named entity taggers still work well in 2023?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Heute werde ich unsere Arbeit präsentieren, ob die Cornel 2003 Entity-Taggers noch gut funktionieren. Lass uns", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_413.wav", "doc_id": "WBLMIsdIrq.seg_413", "src_text": "And this allows us to identify phenomena that cannot really be captured by the word itself, but that's rather expressed in the sentence structure, such as ellipses resolution.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und das ermöglicht die Identifizierung eines Phänomens, das nicht wirklich vom Wort selbst erfasst werden kann, sondern eher in der Struktur ausgedrückt wird. Jetzt verwenden wir unsere Erkenntnisse aus", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_760.wav", "doc_id": "XejEJmgUmE.seg_760", "src_text": "But when we match the structure, that is when we choose the sentences from the same phenomena in BLiMP or SyntaxGym, we see a massive increase or a massive decrease of the MPP judgement for the model, depending on whether the chosen prefix is acceptable or unacceptable.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Aber wenn wir die Struktur abgleichen, dann wählen wir die Sätze aus den gleichen Phänomenen im Blame-Syntax, Jim. Abhängig davon, ob das gewählte Präfix akzeptabel oder nicht akzeptabel ist, sehen wir einen massiven Anstieg oder eine massive Abnahme des MP-Beurteilung für das Modell.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_511.wav", "doc_id": "dvGkKzmIaN.seg_511", "src_text": "The watermark method need to meet the following properties.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die Watermark-Methode muss die folgenden Eigenschaften erfüllen:", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_747.wav", "doc_id": "XejEJmgUmE.seg_747", "src_text": "And we can also do the same by choosing sentences from a different subset or a different data set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Und wir können dasselbe auch tun, indem wir Sätze aus einem anderen Teilmenge oder einem anderen Datensatz auswählen,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_480.wav", "doc_id": "SUkmfOTvGi.seg_480", "src_text": "The second ingredient is the model size.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Das zweite Element ist die Modellgröße.", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_483.wav", "doc_id": "SUkmfOTvGi.seg_483", "src_text": "Here we also found that more fine tuning examples, actually also leads to better generalization.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "hier haben wir auch festgestellt, dass mehr feinjustierte Beispiele tatsächlich auch zu einer besseren Generalisierung führen.", "score": 88.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_672.wav", "doc_id": "FLkGnzVRew.seg_672", "src_text": "Feel free to get in touch with us if you have any questions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "„Ich fühle mich frei, mit Ihnen in Kontakt zu treten,", "score": 62.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_584.wav", "doc_id": "rISrKoXQCx.seg_584", "src_text": "And it's incredibly hard to determine what is actually neutral and should be retaining language monitoring data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "es ist unfassbar schwer zu bestimmen, was tatsächlich neutral ist und behalten werden sollte, und daher", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_733.wav", "doc_id": "XejEJmgUmE.seg_733", "src_text": "So the minimal pair paradigm basically evaluates language models on top of acceptability judgments.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "bewertet das Minimal-Paarpfadmodell Sprachmodelle im Allgemeinen über Akzeptabilitätsurteile, die auch", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_739.wav", "doc_id": "XejEJmgUmE.seg_739", "src_text": "So it's crucial that we evaluate the models' acceptability throughout the context window and that is what we are trying to do here.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "längeren Kontextfenstern heraus, daher ist es entscheidend, dass wir die Modelle aufgrund ihrer Akzeptabilität durch das gesamte Kontextfenster bewerten. Und genau das versuchen wir hier zu tun:", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_190.wav", "doc_id": "SLpqvupgvW.seg_190", "src_text": "The first one is uniform at random.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die erste ist einheitlich.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_298.wav", "doc_id": "PIZEXUFLAR.seg_298", "src_text": "We use one instruction versus 5 instruction.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "bei dem wir eine Anweisung mit fünf Anweisungen vergleichen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_646.wav", "doc_id": "FLkGnzVRew.seg_646", "src_text": "We used dissonance-first approach, as seen in the flow chart here.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir verwenden den Dissonanz-Erstansatz, wie Sie hier im Diagramm sehen.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_213.wav", "doc_id": "SLpqvupgvW.seg_213", "src_text": "Here is a link to our dataset.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hier ist eine Verbindung zu unserem Datensatz.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_561.wav", "doc_id": "rISrKoXQCx.seg_561", "src_text": "Secondly, we aim to investigate to which extent the political biases of language models are actually picked up from training data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zweitens werden wir Wochen investieren, um zu untersuchen, inwieweit sich die politischen Sprachmodelle aus den Trainingsdaten ableiten", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_304.wav", "doc_id": "PIZEXUFLAR.seg_304", "src_text": "We design a new metric called sensitivity.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir entwerfen außerdem eine neue metrische Sensibilität. Also", "score": 35.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_325.wav", "doc_id": "dJGfOSFgZO.seg_325", "src_text": "For each of the existing methods, we collected evaluations on eight of the most commonly measured aspects of dialogue, since this is the standard practice for evaluating chat models along multiple dimensions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Für jede der bestehenden Methoden haben wir Bewertungen von acht der am häufigsten gemessenen Aspekte des Dialogs, da dies die Standardpraxis für die Bewertung von Chatmodellen über mehrere Dimensionen hinweg ist.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_252.wav", "doc_id": "oYCKgTzTDy.seg_252", "src_text": "While the green line is the Monolingual Setting.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "ohne Schüsse, während die grüne Linie die monolinguale Einstellung ist.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_212.wav", "doc_id": "SLpqvupgvW.seg_212", "src_text": "We've also shown that the models are domain-generalizable.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir haben auch gezeigt, dass die Modelle domänenspezifisch sind.", "score": 88.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_472.wav", "doc_id": "SUkmfOTvGi.seg_472", "src_text": "This is a data set that we collected from Reuters News from 2020, and then annotated them with the same CoNLL-2003 annotation guidelines.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "ein Datensatz ist, den wir aus den Nachrichten von Reuters aus dem Jahr 2020 gesammelt haben, und den wir dann mit den gleichen Anmerkungslinien aus dem Jahr 2020 von Carno annotiert haben.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_604.wav", "doc_id": "oeooqChmKK.seg_604", "src_text": "After a long day at work deciding cases in a law court, he was happy to relax.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "langen Arbeitstag in einem Gerichtssaal.", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_459.wav", "doc_id": "hgIDlKNiFM.seg_459", "src_text": "Finally, as a conclusion our proper system offered better performance on nine of the 11 downstream tasks and surpassed globally the result of the generic model, here CamemBERT.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Schlussfolgerung, übertrifft unser eigentliches System mit einer besseren Leistung auf neun der elf Downstream-Tasks weltweit das Ergebnis des generischen Modells hier, Camembert.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_40.wav", "doc_id": "aQpIWggfCo.seg_40", "src_text": "We find that T5 fine-tuned on CoScript can generate scripts of higher quality than most large language models, indicating that smaller models can surpass larger models when properly trained on suitable datasets.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "FAN-Sites kann TfF-Font-Oscore Scripte von höherer Qualität generieren als die meisten langsprachigen Modelle, was darauf hindeutet, dass kleinere Modelle größere Modelle unterstützen können, wenn sie richtig auf geeigneten Datensätzen trainiert werden.", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_463.wav", "doc_id": "SUkmfOTvGi.seg_463", "src_text": "Hello everyone, my name is Shuheng.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo alle, mein Name ist Xu", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_214.wav", "doc_id": "SLpqvupgvW.seg_214", "src_text": "Thanks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "vielen Dank.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_179.wav", "doc_id": "SLpqvupgvW.seg_179", "src_text": "In the second speech bubble, Alice says, \"Do you mean 'Easy on Me' or 'I Gotta Feeling'?\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "In dem zweiten Gesprächsballon sagt Alice: „Meinst du ‚leicht‘ auf mich oder ‚ich habe eine Gefühle‘?", "score": 35.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_485.wav", "doc_id": "SUkmfOTvGi.seg_485", "src_text": "The first one is adaptive overfitting, which is overfitting costs by reusing the same test set over and over again and this is usually manifested as the diminishing returns on a new test set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "die erste ist die adaptive Überpassung, die durch das Wiederholte Wiederverwenden des gleichen Testsets verursachte Überpassung. Und dies wird normalerweise durch die Rückkehr der Abnahme auf einem neuen Testset manifestiert.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_242.wav", "doc_id": "oYCKgTzTDy.seg_242", "src_text": "So, regarding analysis of monolingual models, we evaluate on two groups of models including Encoder-PTR which stands for Multilingual Pretrained Encoders with Pointer-based Decoders, such as XLM-R + PTR and mBERT + PTR.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "fest, so dass wir die Analyse von monolingualen Modellen auf zwei Gruppen von Modellen beschränken. Inklusive Encoder-PR, was für mehrsprachige vorgefertigte Encoder mit Zeiger-basierten Dekodern wie XlN+PR und Bert+PR steht. Und", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_364.wav", "doc_id": "gGbuDbHhyc.seg_364", "src_text": "Typically we only need 20 samples per class to attain high performance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Normalerweise brauchen wir nur zwanzig Proben pro Klasse, um eine hohe Leistung zu erzielen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_537.wav", "doc_id": "dvGkKzmIaN.seg_537", "src_text": "We conduct experiments on four data sets AG News, MIND, SST2 and Enron Spam.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir führen Experimente mit vier Datensätzen durch: AgNews, Mind, Sst2 und Ares.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_80.wav", "doc_id": "TVCREhgqUP.seg_80", "src_text": "To give you a teaser of the experimental results, here we compare our method with other treeless models on the COGS benchmark.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Um Ihnen einen Vorgeschmack auf die Ergebnisse der Experimente zu geben, vergleichen wir hier unsere Methode mit anderen Treeless-Modellen auf der Basis des Kogs-Benchmark.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_364.wav", "doc_id": "gGbuDbHhyc.seg_364", "src_text": "Typically we only need 20 samples per class to attain high performance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "benötigen wir nur 20 Beispiele pro Klasse, um eine hohe Leistung zu erzielen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_727.wav", "doc_id": "oaOHnMCwad.seg_727", "src_text": "Thank you.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vielen Dank.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_819.wav", "doc_id": "WTTtiRKFZI.seg_819", "src_text": "However, when the governor is on the right, as here, \"laughed\" governs the coordination Ted and Ned, this effect disappears.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "tritt dieser Effekt auf, wenn die Regierung auf der rechten Seite regiert, die Koordinierung von oben nach unten. Daher zeigen wir, dass", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_56.wav", "doc_id": "TVCREhgqUP.seg_56", "src_text": "In contrast to standard machine learning evaluation, the test set does not come from the same distribution but contains structurally unseen logical forms.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Im Gegensatz zur standardisierten Maschinenlernfähigkeit, kommt der Testset nicht aus der gleichen Verteilung, sondern enthält strukturelle, anisotrope und logische Formen.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_414.wav", "doc_id": "WBLMIsdIrq.seg_414", "src_text": "So now we use our findings from our analysis to design a benchmark for document-level translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "verwenden wir unsere Ergebnisse aus den Analysen, um einen Benchmark für die Dokumenten-Normalisierung zu erstellen.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_828.wav", "doc_id": "GvEBWkLmuI.seg_828", "src_text": "Hi, I'm Myra and today I'll be talking about our paper \"Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo, ich bin Mira, und heute werde ich über Papiermarken sprechen, die natürliche Sprache verwenden, um Typen und Sprachmodelle zu messen. Diese Arbeit wird", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_181.wav", "doc_id": "SLpqvupgvW.seg_181", "src_text": "And in the third speech bubble, Bob uses an indirect reference to select one of these entities, for example, \"the newer one.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Frage, und im dritten Sprachballon verwendet Bob einen direkten Bezug, um eine dieser Entitäten auszuwählen, zum Beispiel die neuere.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_78.wav", "doc_id": "TVCREhgqUP.seg_78", "src_text": "We determine the third token in the output in a similar way by jumping to another multiset token.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir bestimmen den dritten Token im Output in ähnlicher Weise, indem wir zu einem anderen Multiset-Token springen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_403.wav", "doc_id": "WBLMIsdIrq.seg_403", "src_text": "And we perform our analysis on transcripts of TED talks that have been translated from English to 14 different languages.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und wir führen unsere Analysen von Transkripten von Ted Talks durch, die aus dem Englischen in vierzehn verschiedene Sprachen übersetzt wurden.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_537.wav", "doc_id": "dvGkKzmIaN.seg_537", "src_text": "We conduct experiments on four data sets AG News, MIND, SST2 and Enron Spam.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir führen Versuche auf vier Datensätzen durch: Agnew, Mind, SSTD2 und Aresam; wir", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_508.wav", "doc_id": "dvGkKzmIaN.seg_508", "src_text": "However, recent works have shown that the attacker may steal the model through learning from the embedding and provide similar services.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Jüngste Arbeiten haben jedoch gezeigt, dass der Angreifer das Modell durch das Erlernen der Einfügung und die Bereitstellung ähnlicher Dienste stehlen kann.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_293.wav", "doc_id": "PIZEXUFLAR.seg_293", "src_text": "Here is our main result.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Anpassung von Anweisungen", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_640.wav", "doc_id": "FLkGnzVRew.seg_640", "src_text": "So why does this matter?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Warum ist das wichtig?", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_490.wav", "doc_id": "SUkmfOTvGi.seg_490", "src_text": "So what about temporal drift then?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Was ist dann mit der Temperatur? Für", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_83.wav", "doc_id": "TVCREhgqUP.seg_83", "src_text": "In our paper, we solve a couple of interesting technical challenges.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "In unserem Papier stellen wir eine Reihe interessanter technischer Herausforderungen dar.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_655.wav", "doc_id": "FLkGnzVRew.seg_655", "src_text": "We find that on transferring the zero-shot performance on the annotated data set is already much better than chance with the best, with AUC .62.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir finden, dass die Leistung bei der Übertragung auf dem annotierten Datensatz bereits viel besser ist als das Zufallsverhalten mit", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_578.wav", "doc_id": "rISrKoXQCx.seg_578", "src_text": "So this has sound the alarm for us to acknowledge and tackle the fairness issues resulting by language model political leanings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "einfach loslaufen, ohne dass es irgendeine Kontrolle gibt, also ist das eine Alarmierung für uns, die wir das erkennen und abstellen", "score": 25.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_353.wav", "doc_id": "gGbuDbHhyc.seg_353", "src_text": "But like an elephant in the room this necessity is often overlooked.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "aber wie ein Elefant im Raum wird diese Notwendigkeit oft übersehen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_785.wav", "doc_id": "WTTtiRKFZI.seg_785", "src_text": "Now the aim of this paper is to produce a novel argument for the symmetric structures of coordination, like these two and against the asymmetric structures of coordination, like these two.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Das Ziel dieses Papiers ist es nun, ein neues Argument für die symmetrischen Koordinatensysteme dieser Art und gegen die asymmetrischen Koordinatensysteme dieser Art zu liefern.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_4.wav", "doc_id": "aQpIWggfCo.seg_4", "src_text": "And show that large language models can effectively decompose goals into steps.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "zeigen, dass große Sprachmodelle Ziele effektiv in Schritte zerlegen können.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_754.wav", "doc_id": "XejEJmgUmE.seg_754", "src_text": "So first, we look at the Wikipedia sentences, which are completely irrelevant to the current query pair, and there we find that the MPP judgments are mostly robust for arbitrary context length.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zunächst betrachten wir die Wikipedia-Sätze, die für das aktuelle Fragepaar völlig irrelevant sind, und stellen fest, dass die MPP-Urteile für den freien Kontext in der Regel robust sind.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_32.wav", "doc_id": "aQpIWggfCo.seg_32", "src_text": "However, previous studies do not enable planning for specific goals and manual dataset annotation is expensive.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Allerdings ermöglichen vorherige Studien keine Planung für spezifische Ziele und die manuelle Datensatz-Annotation ist teuer.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_28.wav", "doc_id": "aQpIWggfCo.seg_28", "src_text": "With our method, InstructGPT can generate scripts of higher quality.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Mit unserer Methode kann Insensibilität zu Schmerzen von höherer Qualität führen.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_825.wav", "doc_id": "WTTtiRKFZI.seg_825", "src_text": "So see the paper for the full arguments.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Sie sich das Papier für die vollständige Übereinkunft", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_153.wav", "doc_id": "wLqFAuDnKa.seg_153", "src_text": "So, in particular, the most common errors are omission errors.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Die häufigsten Fehler sind Omissionsfehler, wie es scheint.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_532.wav", "doc_id": "dvGkKzmIaN.seg_532", "src_text": "Back door data set contains sentences of which all words belong to the trigger set while all words in the sentences of benign data set do not belong to the trigger sets.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Datenmenge. Das Backdoor-Dataset enthält Sätze, in denen alle Wörter zum Trigger-Set gehören. Während alle Wörter in den Sätzen von \"benign\" nicht zum Trigger-Set gehören.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_753.wav", "doc_id": "XejEJmgUmE.seg_753", "src_text": "So how does the model do?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wie funktioniert das Modell?", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_270.wav", "doc_id": "PIZEXUFLAR.seg_270", "src_text": "However, there is no large-scale publicly-available multi-modal instruction task.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "es gibt keine großen öffentlich zugänglichen Multimodalunterrichtsaufgaben,", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_174.wav", "doc_id": "SLpqvupgvW.seg_174", "src_text": "Our data set covers three different domains: music, books, and recipes.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Unser Datensatz umfasst drei verschiedene Bereiche: Musik, Bücher und Nachrichten.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_862.wav", "doc_id": "GvEBWkLmuI.seg_862", "src_text": "First, from our groups, the top words include things like \"culture\", \"tradition\", \"proud\", and \"exotic\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zunächst für Markengruppen, die Top-Wörter beinhalten Dinge wie Kultur, Tradition, Stolz und Exotik,", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_379.wav", "doc_id": "gGbuDbHhyc.seg_379", "src_text": "Finally, we have open-sourced our code.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Schließlich haben wir unser Open-Source-Code,", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_558.wav", "doc_id": "rISrKoXQCx.seg_558", "src_text": "So some preliminary results demonstrate that first, language models do have varying political leanings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "verankert sind. So zeigen einige vorläufige Ergebnisse, dass erste Sprachmodelle noch unterschiedliche politische Neigungen haben,", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_444.wav", "doc_id": "hgIDlKNiFM.seg_444", "src_text": "Afterwards, we ask ourselves how much data do we need to train a specialized model on French data?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Danach fragen wir uns, wie viele Daten wir brauchen, um ein spezielles Modell auf französischen Daten zu trainieren.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_138.wav", "doc_id": "wLqFAuDnKa.seg_138", "src_text": "In our experiments, we settled for a 5-shot prompting strategy where we just marked each sentence that we provide to the system, with the language it's in.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In unseren Experimenten haben wir uns für eine fünf-Schuss-Strategie entschieden, bei der wir die Sätze, die wir dem System zur Verfügung stellen, einfach mit der Sprache markieren.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_751.wav", "doc_id": "XejEJmgUmE.seg_751", "src_text": "Finally, we can choose sentences from a completely unrelated domain such as Wikipedia.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Schließlich können wir Sätze aus einer völlig unabhängigen Domäne wie Wikipedia auswählen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_634.wav", "doc_id": "FLkGnzVRew.seg_634", "src_text": "We begin by defining cognitive dissonance and why it is an important problem to study in language.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "mit der Definition kognitiver Dissonanzen und warum es ein wichtiges Problem ist, in der Sprache zu", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_852.wav", "doc_id": "GvEBWkLmuI.seg_852", "src_text": "So in our method, we first designate what the unmarked and marked groups are, and then we compare the personas using the Fightin’ Words method, which is basically using weighted log-odds ratios to distinguish the top words for each marked group.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "besteht also darin, zuerst zu bestimmen, was die unmarkierten und markierten Gruppen sind. Dann vergleichen wir die Personen, die die Kämpfer-Wörtermethode verwenden, die im Grunde die gewogenen Logos-Rationen verwenden, um die oberen Wörter für jede Markengruppe zu unterscheiden.", "score": 35.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_426.wav", "doc_id": "WBLMIsdIrq.seg_426", "src_text": "So this sort of suggests where we would need to see more progress for document-level translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Phänomenen wie Elipsen und Formen verwenden. Also müssen wir mehr Fortschritte in der Dokumentation machen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_301.wav", "doc_id": "PIZEXUFLAR.seg_301", "src_text": "As we can see by transfer learning from natural instruction datasets, the model can achieve much better sensitivity compared to the original OFA model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wie wir sehen, kann das Modell durch das Lernen aus natürlichen Anweisungsdatensätzen eine viel bessere Sensitivität im Vergleich zum ursprünglichen OFA-Modell erzielen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_828.wav", "doc_id": "GvEBWkLmuI.seg_828", "src_text": "Hi, I'm Myra and today I'll be talking about our paper \"Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hi, ich bin Myriam und heute werde ich über unser Paper Marked Personas sprechen, indem wir natürliche Sprachmodelle verwenden, um Stereotypen in Sprachmodellen zu messen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_321.wav", "doc_id": "dJGfOSFgZO.seg_321", "src_text": "ABC-Eval is capable of measuring the rates at which chat models will commit various thematic errors.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "ABCEval ist in der Lage, die Preise zu messen, die Chat-Modelle begehen werden, wodurch verschiedene thematische Fehler entstehen.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_20.wav", "doc_id": "aQpIWggfCo.seg_20", "src_text": "Previous studies have shown that the output quality of language models falls in high variance, leading to bad performance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "ist. Vorherige Studien haben gezeigt, dass die Ausgangsqualität von Large Models in hohen Variablen liegt, was zu schlechter Leistung führt.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_693.wav", "doc_id": "oaOHnMCwad.seg_693", "src_text": "Our framework works in two main steps.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Unser Rahmen funktioniert in zwei Hauptschritten.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_131.wav", "doc_id": "wLqFAuDnKa.seg_131", "src_text": "We use state-of-the-art, neural MT metrics, and additionally also show expert-based human evaluation results.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir verwenden state-of-the-art-Neural-MT-Metriken und zeigen zusätzlich Expertenbasierte Human-Evaluationsergebnisse. Schließlich", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_879.wav", "doc_id": "GvEBWkLmuI.seg_879", "src_text": "Have a good time at ACL.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Aufmerksamkeit. Ich hoffe, Sie haben einen schönen Tag an der ACL.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_266.wav", "doc_id": "PIZEXUFLAR.seg_266", "src_text": "However, most previous works on instruction tuning focused on improving the zero-shot performance on language only tasks, while computer vision and multi-modal tasks have been left out.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die meisten früheren Arbeiten zur Anweisungsabstimmung konzentrierten sich jedoch auf die Verbesserung der serialisierten Leistung bei Sprachaufgaben, während Computervision- und Multimode-Aufgaben außen vor gelassen wurden.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_115.wav", "doc_id": "uZBWfYjYnf.seg_115", "src_text": "And we compare also with the state-of-the-art architecture specifically tailored for simultaneous pre-translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "und wir vergleichen sie auch mit dem Zustand der Kunstarchitektur, die speziell für die simultane Übersetzung geeignet ist.", "score": 42.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_312.wav", "doc_id": "dJGfOSFgZO.seg_312", "src_text": "So let's say that you just developed a dialogue model and you want to see how well it compares against the current state-of-the-art.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Sie es uns sagen, dass Sie gerade ein Dialogmodell entwickelt haben und sehen möchten, wie gut es dem aktuellen Zustand der Kunst entspricht.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_334.wav", "doc_id": "dJGfOSFgZO.seg_334", "src_text": "For example, the bots we tested have common sense violations in around 20% of their responses.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zum Beispiel haben die Roboter, die wir getestet haben, in etwa 20 ihrer Antworten Verstöße gegen das gängige Verständnis.", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_319.wav", "doc_id": "dJGfOSFgZO.seg_319", "src_text": "We call this approach annotating behaviors in chat or ABC-Eval in short.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "bestimmte Verhaltensweisen ausdrückt oder nicht.", "score": 18.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_256.wav", "doc_id": "oYCKgTzTDy.seg_256", "src_text": "Pretraining on English natural language can significantly boost the performance of Few-shot on target natural languages, and we found multilingual language models such as Codex and BLOOM are still inadequate for cross-lingual semantic parsing tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "die Leistung von Target-Sprachen erheblich steigern können. Und wir finden mehrsprachige Sprachmodelle wie Codas in Blau, die immer noch für mehrere Sprachen geeignet sind.", "score": 59.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_293.wav", "doc_id": "PIZEXUFLAR.seg_293", "src_text": "Here is our main result.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hier sind unsere wichtigsten Ergebnisse.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_278.wav", "doc_id": "PIZEXUFLAR.seg_278", "src_text": "In which the input text, images, instructions and bounding boxes are represented in the same token space.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "in dem Eingabetext, Bilder, Anweisungen und Begrenzungsfelder im selben Tokenraum dargestellt werden.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_621.wav", "doc_id": "oeooqChmKK.seg_621", "src_text": "We evaluate the data set both with human study participants, and established coreference resolution models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir bewerten das Datensatz sowohl mit menschlichen Studienteilnehmern als auch mit eingerichteten Gegenüberstellungsmodellen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_243.wav", "doc_id": "oYCKgTzTDy.seg_243", "src_text": "And, we also evaluate Encoder-Decoder models, which is Multilingual Pretrained Encoder-Decoder Models, such as mBART and mT5.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "bewerten auch Kodierer-Modelle, die mehrsprachig geschult sind, wie z. B. Bart und Tf Five.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_39.wav", "doc_id": "aQpIWggfCo.seg_39", "src_text": "With CoScript we can try smaller but specialized models for constrained language planning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Mit CoScript können wir kleinere, aber spezialisierte Modelle für die konsistente Sprachplanung trainieren.", "score": 61.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_100.wav", "doc_id": "uZBWfYjYnf.seg_100", "src_text": "So what is our solution?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Was ist die Lösung?", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_709.wav", "doc_id": "oaOHnMCwad.seg_709", "src_text": "For example, we find that data sets and models are most aligned to English speaking countries.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "stellen fest, dass es sich um", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_460.wav", "doc_id": "hgIDlKNiFM.seg_460", "src_text": "We are also observing that more specialized data is better, but it doesn't scale well.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "spezialisierte Daten besser sind, mehr spezialisierte Daten sind besser, aber es passt nicht gut.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_332.wav", "doc_id": "dJGfOSFgZO.seg_332", "src_text": "These reliable, informative, and distinct ABC-Eval metrics enable us to evaluate conversational AI with a higher resolution than previous methods are able to achieve.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Diese zuverlässigen, informativen und differenzierten ABC-Einheiten ermöglichen es uns, eine höhere Auflösung zu erzielen als die vorherigen Methoden. In", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_813.wav", "doc_id": "WTTtiRKFZI.seg_813", "src_text": "But what's novel in this paper is that we observed that this tendency only occurs when the governor is on the left or absent.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "aber neu in dieser Arbeit ist, dass Wir haben beobachtet, dass diese Tendenz nur dann auftritt, wenn der Gouverneur links abwesend ist also", "score": 56.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_131.wav", "doc_id": "wLqFAuDnKa.seg_131", "src_text": "We use state-of-the-art, neural MT metrics, and additionally also show expert-based human evaluation results.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir verwenden hochmoderne Neuro-MT-Metriken und zeigen zusätzlich auch Experten-basierte Human-Evaluierungsergebnisse. Schließlich", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_172.wav", "doc_id": "SLpqvupgvW.seg_172", "src_text": "This is an important problem in conversational systems and also for benchmarking LLMs' entity understanding.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Dies ist ein wichtiges Problem in Konversations- und Benchmarking-Systemen. LLMs-Entitätenverständnis: Wir sind uns nicht bewusst, dass es", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_798.wav", "doc_id": "WTTtiRKFZI.seg_798", "src_text": "But it's also OK to say, \"Marge read yesterday this absolutely fascinating book about bees.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Aber es ist auch in Ordnung zu sagen, dass March gestern dieses absolut faszinierende Buch über Bienen", "score": 57.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_303.wav", "doc_id": "PIZEXUFLAR.seg_303", "src_text": "So overall, we propose the first large scale multi-model instruction tuning dataset with significantly improved their short capability of OFA, and we explore different transfer learning technique and show their benefits.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Im Großen und Ganzen schlagen wir ein erstes groß angelegtes multimodales Anpassungssystem vor, das die Fähigkeiten von OFI erheblich verbessert und wir untersuchen verschiedene Transferlernmethoden und zeigen ihre Vorteile.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_317.wav", "doc_id": "dJGfOSFgZO.seg_317", "src_text": "However, we believe there is a more precise and reliable strategy for dimensional dialogue evaluation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir glauben jedoch, dass es eine präzisere und verlässlichere Strategie für die dimensionale Dialogbewertung gibt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_79.wav", "doc_id": "TVCREhgqUP.seg_79", "src_text": "We continue this process until every token from the first stage has been visited exactly once.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir setzen diesen Prozess fort. Und bis jetzt wurde jeder Token aus der ersten Stufe genau einmal besucht.", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_144.wav", "doc_id": "wLqFAuDnKa.seg_144", "src_text": "The summary of our experimental results is that the example quality is more important than the similarity to the source sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Zusammenfassung unserer experimentellen Ergebnisse ist, dass die Qualität des Beispiels wichtiger ist als die Ähnlichkeit zum Quatsch. Es", "score": 19.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_331.wav", "doc_id": "dJGfOSFgZO.seg_331", "src_text": "On the other hand, the combination of all turn-level Likert metrics explains far less of the quality, and fewer of these metrics carry unique information.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "der anderen Seite spiegelt die Kombination der alternativen Likert-Skalen die Qualität weit weniger genau wider, und nur wenige dieser Skalen liefern eindeutige Informationen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_555.wav", "doc_id": "rISrKoXQCx.seg_555", "src_text": "Secondly, how do language models with different political leanings actually perform on downstream tasks and whether that might result in fairness issues in NLP applications?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "die Sprachmodelle tatsächlich funktionieren, und drittens, ob das Ergebnis bei der Verwendung von LP-Anwendungen korrekt ist. So spezifizieren", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_478.wav", "doc_id": "SUkmfOTvGi.seg_478", "src_text": "The first one is the model architecture.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "erste ist die Modellarchitektur.", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_299.wav", "doc_id": "PIZEXUFLAR.seg_299", "src_text": "As we can see, using more instructions can improve the model's overall performance and reduce its sensitivity a lot.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "wir sehen können, dass die Verwendung mehrerer Anweisungen die Leistung des Modells verbessern und seine Sensitivität verringern kann.", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_789.wav", "doc_id": "WTTtiRKFZI.seg_789", "src_text": "So \"Marge read it yesterday\" is fine because the direct object is close to the verb, while \"Marge read yesterday it\" is much worse.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "So ist es richtig, weil das „direct object“ dem Verb näher ist. Während es gestern Abend noch okay war, ist es heute viel schlimmer, weil", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_22.wav", "doc_id": "aQpIWggfCo.seg_22", "src_text": "We first show constraint types with examples for InstructGPT and obtain specific goals based on the seed abstract goals.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zuerst zeigen wir konstruierte Typen mit Beispielen für intrinsische PT und erhalten spezifische Ziele auf der Grundlage der gesagten abstrakten Ziele.", "score": 35.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_808.wav", "doc_id": "WTTtiRKFZI.seg_808", "src_text": "So what we did, we extracted various statistics about coordination from the enhanced version of the Penn Treebank and see the paper \"Why wouldn't you use universal dependencies\" and these statistics confirm the observation made many times before that left conjuncts tend to be shorter.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir haben viele Statistiken über Koordination aus der erweiterten Version der Penetree Bank extrahiert und sehen das Papier, warum wir keine Universitätsabhängigkeiten benutzt haben. Und diese Statistiken bestätigen die Beobachtung, die viele Male vorher gemacht wurde, dass linke Konjugate tendieren, kürzer zu", "score": 63.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_283.wav", "doc_id": "PIZEXUFLAR.seg_283", "src_text": "In addition, we randomly sample 20 tasks from the test split of natural instructions as an unseen task for NLP.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "jede Aufgabe, zusätzlich wählen wir zufällig eine Aufgabe aus dem Test der natürlichen Anweisung aus.", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_699.wav", "doc_id": "oaOHnMCwad.seg_699", "src_text": "In Live in the Wild is an online experimentation platform where we can recruit divers volunteers.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Lab in the Wild handelt es sich um eine Online-Experimentierplattform, auf der wir diverse Freiwillige rekrutieren können,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_505.wav", "doc_id": "dvGkKzmIaN.seg_505", "src_text": "Currently, large language models such as GPT, LLAMA, PALM are exceptional in natural language understanding and generation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Watermark Lassen Sie uns zunächst den Hintergrund über die Einbettung von AT-Servicen vorstellen. Derzeit sind große Sprachmodelle wie tpt, lama, palm in der natürlichen Sprachverständigung und -erzeugung außerordentlich.", "score": 86.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_446.wav", "doc_id": "hgIDlKNiFM.seg_446", "src_text": "To answer this question, we first train and compare four from-scratch models: a first version of DrBERT, with 7 GB of NACHOS; a second version of 4 GB of set of NACHOS; a first version of ChuBERT, which is a clinical model with 4 GB of sentences taken from clinical notes; and a final version of ChuBERT with a mix of 4 GB of set of NACHOS and 4 GB of clinical notes.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "oder mehr. Zu dieser Frage: Wir werden die erste Version und die zweite Version mit sieben Gigabytes von nativen Anwendungen erstellen. Die erste Version von Shubert, die ein klinisches Modell ist, mit vier Gigabyte Sätze aus klinischen Notizen, und die letzte Version von Shubert, mit vier Gigabyte Sätze aus klinischen Notizen.", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_599.wav", "doc_id": "oeooqChmKK.seg_599", "src_text": "We evaluate the data set with human study participants and established coreference resolution models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir bewerten den Datensatz mit Human Study-Teilnehmern und einem etablierten Korrelationsmodell Hier ist", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_722.wav", "doc_id": "oaOHnMCwad.seg_722", "src_text": "And a good example of this is the Masakhani initiative.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und ein gutes Beispiel dafür", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_832.wav", "doc_id": "GvEBWkLmuI.seg_832", "src_text": "They usually rely on hand-constructed data sets that are very time-consuming to curate and they also usually only. measure very specific stereotypes, meaning that they don't generalize well to other demographics or contexts, or they simply capture very general broad associations, like negative associations with particular groups.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Sie stützen sich in der Regel auf handelskonstruierte Datensätze, die sehr viel Zeit in der Kuration verbrauchen. Und sie verwenden auch normalerweise nur sehr spezifische Stereotypen, die bedeuten, dass sie sich nicht allgemein an andere Demographien oder Kontakte heranwenden, und sie fassen sich einfach sehr allgemeine breite Assoziationen ein, wie negative Assoziationen mit bestimmten Gruppen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_607.wav", "doc_id": "oeooqChmKK.seg_607", "src_text": "First, entity-specific knowledge such as \"Servin is a judge.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "erstens spezifische Kenntnisse der Einheit, z. B.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_588.wav", "doc_id": "rISrKoXQCx.seg_588", "src_text": "Thank you for your time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "danke für deine Zeit.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_546.wav", "doc_id": "rISrKoXQCx.seg_546", "src_text": "Hi, I'm Shangbin, PhD student in the University of Washington.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Ich bin ein Student der Universität von Washington", "score": 76.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_272.wav", "doc_id": "PIZEXUFLAR.seg_272", "src_text": "Here we present MultiInstruct, the first multi-modal instruction tuning benchmark dataset that consists of 62 diverse multi-modal tasks covering 10 broad categories.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hier präsentieren wir Multi-Instruction, das erste Benchmark-Set für die Anpassung von Multi-Modellen, das aus zweiundsechzig verschiedenen Multi-Modellen besteht, die zehn Kategorien abdecken.", "score": 41.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_62.wav", "doc_id": "TVCREhgqUP.seg_62", "src_text": "This works well, but trees are usually not given and need to be obtained somehow.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Dies funktioniert gut, aber Bäume werden normalerweise nicht gegeben und müssen auf eine Weise erhalten werden.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_163.wav", "doc_id": "SLpqvupgvW.seg_163", "src_text": "Consider this alternative question.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Betrachten Sie diese alternative", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_406.wav", "doc_id": "WBLMIsdIrq.seg_406", "src_text": "And this allows us to find, for example, dual pronouns in Arabic that have relatively high P-CXMI.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und dies ermöglicht das Auffinden von Beispielen für arabische Pronomen, die eine hohe Häufigkeit aufweisen, und", "score": 44.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_276.wav", "doc_id": "PIZEXUFLAR.seg_276", "src_text": "Here we show some example instances from our MultiInstruct dataset, to unify the processing of various input and output data types.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hier zeigen wir einige Beispiele aus unserem Multiversum-Datensatz. Um die Verarbeitung verschiedener Eingabe- und Ausgabedatentypen zu vereinheitlichen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_350.wav", "doc_id": "gGbuDbHhyc.seg_350", "src_text": "In recent works in WSL, so WSL stands for Weakly Supervised Learning, a common claim is that people say that they only train models on the weakly labeled data and achieve high performance on clean test sets.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In jüngsten Arbeiten in Wsl steht „Wsl“ für „weekly superwise learning“. Eine häufige Behauptung ist, dass Menschen nur Modelle unter wöchentlichen Daten trainieren und hohe Leistungen erzielen können.", "score": 76.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_436.wav", "doc_id": "hgIDlKNiFM.seg_436", "src_text": "Then, we present our results on 11 biomedical and clinical downstream tasks in French.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "dann präsentieren wir unsere Ergebnisse zu elf biomedizinischen und klinischen Aufgaben in Französisch. Schließlich", "score": 79.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_531.wav", "doc_id": "dvGkKzmIaN.seg_531", "src_text": "We first construct a back door and a benign data set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir bauen zunächst eine Hintertür und ein Böses", "score": 39.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_871.wav", "doc_id": "GvEBWkLmuI.seg_871", "src_text": "So rather than actually working towards changing those obstacles, it puts pressure on those people to overcome them, which leads to a very negative health outcomes for these people, among other harms.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Denn das aktive Arbeiten an der Veränderung dieser Probleme übt Druck auf diese Personen aus, was sehr negative gesundheitliche Auswirkungen für diese Personen", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_354.wav", "doc_id": "gGbuDbHhyc.seg_354", "src_text": "The aforementioned doubt is asked to ask three research questions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Der oben genannte Zweifel gibt uns Anlass, drei Forschungsfragen", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_249.wav", "doc_id": "oYCKgTzTDy.seg_249", "src_text": "We also compare the cross-language performance gap.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir vergleichen auch die Cross-Lingual-Performances.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_329.wav", "doc_id": "dJGfOSFgZO.seg_329", "src_text": "Finally, we checked whether each evaluation metric captures a unique aspect of chat quality using a stepwise linear regression.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Schließlich haben wir überprüft, ob jede Bewertungsmetrik einen einzigartigen Aspekt der Überprüfung der Qualität mit einer schrittweisen linearen Regression erfasst.", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_775.wav", "doc_id": "WTTtiRKFZI.seg_775", "src_text": "A similar approach is assumed in Igor Mel'čuk's meaning text theory, where again, the whole coordinate structure is headed by the first conjuct.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Texttheorie besagt, dass die gesamte Struktur durch den ersten Knotenpunkt bestimmt wird,", "score": 72.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_254.wav", "doc_id": "oYCKgTzTDy.seg_254", "src_text": "We also find some other interesting findings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir finden auch einige andere interessante Ergebnisse,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_413.wav", "doc_id": "WBLMIsdIrq.seg_413", "src_text": "And this allows us to identify phenomena that cannot really be captured by the word itself, but that's rather expressed in the sentence structure, such as ellipses resolution.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zentrum, und dies ermöglicht es uns, ein Phänomen zu identifizieren, das nicht wirklich von der Welt selbst erfasst werden kann, aber in einer zentralen Struktur eher ausgedrückt wird, so wie etwa eine Lösung. Daher", "score": 61.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_563.wav", "doc_id": "rISrKoXQCx.seg_563", "src_text": "By further pretraining language models on such partisan corpora we can see that the ideological coordinates of the language model also correspondingly shift.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Teile und Korpora untersuchen, können wir sehen, dass die ideologischen Koordinaten des Sprachmodells ebenfalls korrespondieren.", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_474.wav", "doc_id": "SUkmfOTvGi.seg_474", "src_text": "We evaluated them on both the CoNLL-03 test sets and the CoNLL++.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir bewerteten sie sowohl auf dem Corolla-Testset als auch auf dem Corolla-Plus-Testset. Zuletzt, aber", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_18.wav", "doc_id": "aQpIWggfCo.seg_18", "src_text": "We dig into a more fine-grained topic categories of constraints defined in wikiHow.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir gehen tiefer in die feineren topologischen Kategorien von Einschränkungen ein, die in der Wirklichkeit definiert sind.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_42.wav", "doc_id": "aQpIWggfCo.seg_42", "src_text": "We evaluate constrained language planning ability of large language models and develop an over-generate-then-filter method for large language models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "eingeschränkte Sprachplanungskompetenz von Großsprachenmodellen und entwickeln eine übergenerierende Filtermethode für Großsprachenmodelle.", "score": 19.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_837.wav", "doc_id": "GvEBWkLmuI.seg_837", "src_text": "And we can immediately see that this is very generalizable to any demographic because we can just specify whatever identity marker that we want into this prompt.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und wir können sofort sehen, dass dies sehr allgemein auf jede Demografie anwendbar ist, weil wir einfach nur spezifizieren können, welche Identitätsmarker wir in diesen Prompt einbeziehen möchten.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_302.wav", "doc_id": "PIZEXUFLAR.seg_302", "src_text": "We also can see transfer learning from natural instruction datasets can help OFA to attain much better performance on the natural instruct dataset.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir können auch sehen, dass das Transferlernen aus dem Natural-Instruction-Dataset dem OWA helfen kann, auf dem Natural-Instruction-Dataset viel bessere Leistung zu erzielen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_231.wav", "doc_id": "oYCKgTzTDy.seg_231", "src_text": "And for example, we train the English model on English query and during inference we translate the German query using API to English and then use the trained model to predict the SQL.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zum Beispiel trainieren wir das englische Modell auf englischen Anfragen und während der Inferenz übersetzen wir die deutsche Anfrage mit Hilfe der API in englisch und verwenden dann das trainierte Modell, um die Fortsetzung vorherzusagen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_250.wav", "doc_id": "oYCKgTzTDy.seg_250", "src_text": "In this figure, the blue line is Cross-lingual Few-shot transfer.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "In dieser Abbildung ist die blaue Linie die Cross-Lingual-Feature-Transfer, die", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_239.wav", "doc_id": "oYCKgTzTDy.seg_239", "src_text": "We train on one source language and transfer to another language.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "zwischen einer Quellensprache und einer Ziel-Sprache.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_835.wav", "doc_id": "GvEBWkLmuI.seg_835", "src_text": "So we can ask the model to generate a persona, which is a depiction of an imagined individual using a prompt like \"Imagine you are an Asian woman.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "können wir das Modell bitten, eine Persona zu generieren, die eine Darstellung eines imaginären Individuums ist, die mit einem Prompt wie \"Stell dir vor, du bist eine asiatische Frau,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_110.wav", "doc_id": "uZBWfYjYnf.seg_110", "src_text": "This means that these three words will be emitted.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Das bedeutet, dass diese drei Worte ausgegeben werden.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_288.wav", "doc_id": "PIZEXUFLAR.seg_288", "src_text": "In each experiment, we report the min and max performance and the standard deviation of the performance across all 5 experiments.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "in jedem Experiment bewerten. Wir berichten über die Mean und Max-Leistung und die Standardabweichung der Leistung in allen fünf Experimenten.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_679.wav", "doc_id": "oaOHnMCwad.seg_679", "src_text": "Where prospective API is able to detect correctly toxic instances.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "wo die Perspektiv-API in der Lage ist, toxische Instanzen korrekt zu erkennen,", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_304.wav", "doc_id": "PIZEXUFLAR.seg_304", "src_text": "We design a new metric called sensitivity.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir sammeln eine größere Menge an Anweisungen", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_407.wav", "doc_id": "WBLMIsdIrq.seg_407", "src_text": "And this can be explained because English doesn't have dual pronouns, so you need context to determine if a pronoun is dual when translating into Arabic.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und dies kann erklärt werden, weil Englisch keine Dualpronomen hat, also müssen Sie den Kontext bestimmen, ob ein Pronomen dual ist, wenn es in Arabisch übersetzt wird.", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_216.wav", "doc_id": "oYCKgTzTDy.seg_216", "src_text": "Today I'm going to present our work \"XSemPLR: Cross-Lingual Semantic Parsing in Multiple Natural Languages and Meaning Representations\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Heute werde ich unsere Arbeit präsentieren: exemplarisch, semantische Übersetzungen in mehrere natürliche Sprachen und viele Darstellungen. Die", "score": 44.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_366.wav", "doc_id": "gGbuDbHhyc.seg_366", "src_text": "The right figure shows the performance difference between fine-tuning approaches, which are directly applied on the clean data, and WSL approaches, which use the clean data for validation only.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Die rote Figur zeigt den Leistungsunterschied zwischen Fine-Tuning-Ansätzen, die direkt auf saubere Daten angewendet werden, und WSL-Ansätzen, die die sauberen Daten nur zur Validierung verwenden. Wir", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_330.wav", "doc_id": "dJGfOSFgZO.seg_330", "src_text": "You can see how the combination of all ABC-Eval metrics explains over 25% of conversation quality, and as you remove the metrics one at a time, most of them result in losing a decent amount of information about the quality.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "können sehen, wie die Kombination aller ABCE-Werte über fünfundzwanzig Prozent der Gesprächsqualität ausmacht, und wenn Sie die Werte entfernen, sind die meisten Ergebnisse eine vernünftige Menge an Informationen über die Qualität. Auf", "score": 35.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_574.wav", "doc_id": "rISrKoXQCx.seg_574", "src_text": "Similar trends also happen for fake news detection, where we see that left-leaning language models are better at detecting misinformation from their opposite political leaning and vice versa.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Ähnliche Trends gibt es auch für die Fake-News-Erkennung, wo wir sagen, dass die Modelle der linken Sprache besser sind, um Fehlinformationen von der anderen Seite zu erkennen. Dies zeigt, wie", "score": 69.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_188.wav", "doc_id": "SLpqvupgvW.seg_188", "src_text": "Here are the different sampling methods we've used.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hier sind die unterschiedlichen Probiermethoden, die wir verwendet haben:", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_401.wav", "doc_id": "WBLMIsdIrq.seg_401", "src_text": "We can think of words that have high P-CXMI as ones that require context for translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir können Wörter denken, die hohe PSI haben, als Wörter, die für die Übersetzung einen Kontext benötigen.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_844.wav", "doc_id": "GvEBWkLmuI.seg_844", "src_text": "Our prompts to generate these personas were inspired by a study where they gave these prompts to human subjects, finding that by giving it to human subjects, they also were able to surface racial stereotypes.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Unsere Versuche, diese Personen zu erschaffen, wurden von einer Studie inspiriert, in der diese Versuche an menschliche Subjekte gegeben wurden, wobei festgestellt wurde, dass sie auch in der Lage waren, menschliche", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_410.wav", "doc_id": "WBLMIsdIrq.seg_410", "src_text": "And this helps us identify cases like the one here, where in Chinese you need context to translate proper nouns to make sure that you're using the same translation within the document.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und das hilft bei der Identifizierung von Fällen wie dem hier, in chinesischen Dokumenten müssen Sie den Kontext kennen, um sicherzustellen, dass Sie die gleiche Übersetzung verwenden", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_207.wav", "doc_id": "SLpqvupgvW.seg_207", "src_text": "If the language model has access to the exact same background knowledge as the annotators, then the accuracy is really high, it's around 92 to 95%.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wenn das Sprachmodell Zugang zu den exakten gleichen Hintergrundwissen wie die Annotatoren hat, ist die Genauigkeit wirklich hoch, sie liegt bei 92 bis 95%,", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_411.wav", "doc_id": "WBLMIsdIrq.seg_411", "src_text": "And similarly, we find that context is important to translate in the right formality.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und ebenso finden wir, dass die Kontraste in der richtigen Form dargestellt werden.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_103.wav", "doc_id": "uZBWfYjYnf.seg_103", "src_text": "And leverage the knowledge already acquired by the model through the attention mechanism between audio input and textual output.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und das Wissen, das Sie bereits über die Mechanismen der Aufmerksamkeitssteuerung haben, ist ein Beispiel", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_479.wav", "doc_id": "SUkmfOTvGi.seg_479", "src_text": "Through our experiments we found that the transformer models normally generalize better to new data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Durch unsere Experimente haben wir festgestellt, dass sich die Transformatormodelle normalerweise besser auf neue Daten verallgemeinern.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_765.wav", "doc_id": "XejEJmgUmE.seg_765", "src_text": "Basically, we find that the models are sensitive to the perturbed sentences in similar ways.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "judgment trend. Im Wesentlichen stellen wir fest, dass die Modelle in ähnlicher Weise auf die gestörten Sätze reagieren:", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_2.wav", "doc_id": "aQpIWggfCo.seg_2", "src_text": "In everyday life, humans often plan their actions by following step-by-step instructions in the form of goal-oriented scripts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Im Alltagsleben planen Menschen ihre Handlungen häufig durch Schritt-für-Schritt-Anweisungen in der Form von Zielvorstellungen. In der", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_702.wav", "doc_id": "oaOHnMCwad.seg_702", "src_text": "Afterwards to stay engaged in the study, they can compare their responses to an AI and others.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Anschließend können sie sich weiter an der Studie beteiligen, indem sie ihre Antworten mit denen einer AI und anderen vergleichen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_803.wav", "doc_id": "WTTtiRKFZI.seg_803", "src_text": "So instead of 11, 6 is much shorter.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "sein, also etwas mehr als elf, daher", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_791.wav", "doc_id": "WTTtiRKFZI.seg_791", "src_text": "Because here between the verb and the direct object is an adjunct: \"yesterday\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "hier zwischen dem Verb und dem direkten Objekt ein Akkusativum vorkommt.", "score": 45.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_185.wav", "doc_id": "SLpqvupgvW.seg_185", "src_text": "We always use a simple template.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir verwenden immer ein einfaches Template:", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_66.wav", "doc_id": "TVCREhgqUP.seg_66", "src_text": "In this paper, we don't use trees and introduce a neural seq2seq model that directly models the correspondences between fragments of the input and fragments of the output.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "In diesem Papier verwenden wir keine Triebe und führen ein jährliches Sequenz-Modell ein, das die Korrespondenzen zwischen Fragmenten des Eingangs und Fragmenten des Ausgangs direkt modelliert.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_718.wav", "doc_id": "oaOHnMCwad.seg_718", "src_text": "So we have a few recommendations for this.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir haben also ein", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_619.wav", "doc_id": "oeooqChmKK.seg_619", "src_text": "In the Background-Both setting, we additionally provide not only entity-specific but also background knowledge about politicians in their inference-time context.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wir zusätzlich nicht nur spezifische, sondern auch allgemeine Hintergrundinformationen. Im Hintergrund ist ein Politiker", "score": 35.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_332.wav", "doc_id": "dJGfOSFgZO.seg_332", "src_text": "These reliable, informative, and distinct ABC-Eval metrics enable us to evaluate conversational AI with a higher resolution than previous methods are able to achieve.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Diese zuverlässigen, informativen und unverwechselbaren ABC-EVLM-Metriken ermöglichen es uns, die konversationelle AI mit einer höheren Auflösung zu bewerten, als es frühere Methoden ermöglicht haben.", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_694.wav", "doc_id": "oaOHnMCwad.seg_694", "src_text": "The first step is to re annotate data sets with diverse annotators.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "1 das Wiederauswerten von Daten mit verschiedenen Annotatoren und 2 das", "score": 37.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_505.wav", "doc_id": "dvGkKzmIaN.seg_505", "src_text": "Currently, large language models such as GPT, LLAMA, PALM are exceptional in natural language understanding and generation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Gegenwärtig sind große Sprachmodelle wie Tpt, Lama, Palm außergewöhnlich im natürlichen Sprachverständnis und in der Sprachgenerierung. Embedding", "score": 68.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_374.wav", "doc_id": "gGbuDbHhyc.seg_374", "src_text": "Our concrete recommendations for future work are as follows.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Unsere konkreten Empfehlungen für zukünftige Arbeit lauten wie folgt.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_358.wav", "doc_id": "gGbuDbHhyc.seg_358", "src_text": "We addressed these research questions in our work and our findings are as follows.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir behandeln diese Forschungsfragen in unserer Arbeit und unsere Ergebnisse sind wie folgt.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_185.wav", "doc_id": "SLpqvupgvW.seg_185", "src_text": "We always use a simple template.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir verwenden immer eine einfache Vorlage:", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_188.wav", "doc_id": "SLpqvupgvW.seg_188", "src_text": "Here are the different sampling methods we've used.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hier sind die unterschiedlichen Stichprobenmethoden, die wir verwenden. Wenn", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_715.wav", "doc_id": "oaOHnMCwad.seg_715", "src_text": "An example of this is that datasets and models are less aligned to non binary people compared to the men and women counterparts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Ein Beispiel hierfür ist, dass die Datensätze der Modelle nicht nur für Männer und Frauen, sondern auch für die Gleichstellung von Männern und Frauen ausgelegt sind.", "score": 17.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_357.wav", "doc_id": "gGbuDbHhyc.seg_357", "src_text": "Finally, should we only use the clean samples for validation, or there are better ways to utilize them?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Schließlich sollten wir nur die sauberen Beispiele für die Validierung verwenden oder gibt es bessere Möglichkeiten, sie zu nutzen?", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_850.wav", "doc_id": "GvEBWkLmuI.seg_850", "src_text": "So when people are describing a warrior who is a woman, they'll usually actually specify \"woman warrior\" and mark the term with \"woman\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "so dass Personen, die Krieger sind, normalerweise als „Mann-Krieger“ und „Frau-Kriegerin“ bezeichnet werden. Und mehr", "score": 41.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_555.wav", "doc_id": "rISrKoXQCx.seg_555", "src_text": "Secondly, how do language models with different political leanings actually perform on downstream tasks and whether that might result in fairness issues in NLP applications?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "sich die Sprachmodelle unterscheiden, werden die politischen Strömungen tatsächlich in den L-Anwendungen dargestellt. So", "score": 7.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_634.wav", "doc_id": "FLkGnzVRew.seg_634", "src_text": "We begin by defining cognitive dissonance and why it is an important problem to study in language.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir beginnen mit der Definition kognitiver Diskrepanzen und warum es ein wichtiges Problem ist, in der Sprache zu", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_531.wav", "doc_id": "dvGkKzmIaN.seg_531", "src_text": "We first construct a back door and a benign data set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Zuerst erstellen wir einen Backdoor und einen günstigen Datensatz.", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_763.wav", "doc_id": "XejEJmgUmE.seg_763", "src_text": "So we did a series of analysis where we tried to perturb the input sentence by, trying to preserve the relevant structure but adding like noise to the input.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir haben also eine Reihe von Analysen durchgeführt, bei denen wir versuchten, die Eingabensatzung zu stören, indem wir versuchten, die entsprechende Struktur zu erhalten, aber die Eingabensatzung mit einem", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_812.wav", "doc_id": "WTTtiRKFZI.seg_812", "src_text": "So the proportion is bigger of the left short conjunct.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "der Anteil größer als der der kürzeren Konjunkt. Aber was in", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_612.wav", "doc_id": "oeooqChmKK.seg_612", "src_text": "First, we have the typical setting: \"Background-Pretrain\", where background knowledge is assumed to be available at pretrain time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zunächst die typische Einstellung „Rückwärts-Vorwärts-Training“, bei der die Rückwärtswissen während des Vorwärtstrainings verfügbar sein soll.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_301.wav", "doc_id": "PIZEXUFLAR.seg_301", "src_text": "As we can see by transfer learning from natural instruction datasets, the model can achieve much better sensitivity compared to the original OFA model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wie wir durch das Übertragen von Datensätzen aus der natürlichen Anweisung sehen können, dass das Modell viel empfindlicher ist als das ursprüngliche OFA-Modell.", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_24.wav", "doc_id": "aQpIWggfCo.seg_24", "src_text": "Next, a filter model is developed to select the faithful scripts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Als nächstes wird ein Filtermodell entwickelt, um die visuellen Skripte auszuwählen.", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_399.wav", "doc_id": "WBLMIsdIrq.seg_399", "src_text": "And this is done by measuring how much information the context C provides about the target Y, given the source X. You can think of CXMI as the information gained from giving context to the model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "dies misst, wie viel Informationen der Kontext X bereitstellt. Sie können denken, dass die Informationen, die Sie aus dem Modell erhalten, von den Kontakten stammen, die Sie dem Modell gegeben haben.", "score": 35.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_833.wav", "doc_id": "GvEBWkLmuI.seg_833", "src_text": "Furthermore, most work in this space doesn't account for intersectionality, which is the notion that multi-faceted social identities can compound biases and be unique loci of harm.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Weitere, die am häufigsten in diesem Raum vorkommen, sind nicht auf Interaktivität zurückzuführen, was die Vorstellung ist, dass multifunktionale soziale Identitäten kombiniert werden und einzigartig sind.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_22.wav", "doc_id": "aQpIWggfCo.seg_22", "src_text": "We first show constraint types with examples for InstructGPT and obtain specific goals based on the seed abstract goals.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Typen mit Beispielen für Intra-GPT und erreichen spezifische Ziele basierend auf den genannten abstrakten Zielen.", "score": 45.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_224.wav", "doc_id": "oYCKgTzTDy.seg_224", "src_text": "For example, there's only one single model to evaluate them.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "zum Beispiel gibt es nur ein einziges Modell, um sie zu bewerten.", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_155.wav", "doc_id": "wLqFAuDnKa.seg_155", "src_text": "However, the \"Style/Awkward\" category for PaLM is lower than for the state-of-the-art systems, which is an additional signal that PaLM provides really fluent output, but still with some problems of accuracy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Allerdings ist die Kategorie „Stylish Outwear“ für Palm niedriger als für die State-of-the-Art-Systeme, was ein zusätzliches Zeichen ist. Das Parm liefert einen wirklich fließenden Output, aber immer noch mit einigen Problemen der Genauigkeit.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_308.wav", "doc_id": "dJGfOSFgZO.seg_308", "src_text": "Hello, I'm James Finch.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo, ich bin James Finch,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_797.wav", "doc_id": "WTTtiRKFZI.seg_797", "src_text": "It's okay the way instead of \"it\", we have this long NP.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "ist in Ordnung, und dafür haben wir das lange und das kurze.", "score": 5.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_39.wav", "doc_id": "aQpIWggfCo.seg_39", "src_text": "With CoScript we can try smaller but specialized models for constrained language planning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Mit CoScript können wir kleinere, aber spezialisierte Modelle für eingeschränkte Sprachplanung auswählen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_374.wav", "doc_id": "gGbuDbHhyc.seg_374", "src_text": "Our concrete recommendations for future work are as follows.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Unsere konkreten Empfehlungen für zukünftige Arbeiten sind wie folgt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_186.wav", "doc_id": "SLpqvupgvW.seg_186", "src_text": "Do you mean A or B?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "A oder B,", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_36.wav", "doc_id": "aQpIWggfCo.seg_36", "src_text": "To ensure the quality of the validation and test set, we ask crowd-sourced workers to find and revise the incorrect samples.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "um die Güte der Validierung und der Testdaten sicherzustellen, und wir bitten um. Claude forderte die Arbeiter auf, die falschen Proben endlich zu überprüfen.", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_482.wav", "doc_id": "SUkmfOTvGi.seg_482", "src_text": "And last but not least, we all know that the number of fine tuning examples directly affects the performance of a downstream task.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Und schließlich, aber nicht am wenigsten, wissen wir alle, dass die Anzahl von Feinabstimmungsbeispielen die Leistung einer Downstream-Aufgabe direkt beeinflusst.", "score": 69.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_417.wav", "doc_id": "WBLMIsdIrq.seg_417", "src_text": "We can then also note that different languages have different proportions of these discourse phenomena.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Man kann dann auch feststellen, dass die verschiedenen Sprachen unterschiedliche Proportionen dieser Phänomene haben.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_413.wav", "doc_id": "WBLMIsdIrq.seg_413", "src_text": "And this allows us to identify phenomena that cannot really be captured by the word itself, but that's rather expressed in the sentence structure, such as ellipses resolution.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "die nicht wirklich von der Welt selbst erfasst werden können, sondern eher in einer Struktur ausgedrückt werden. Jetzt", "score": 5.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_57.wav", "doc_id": "TVCREhgqUP.seg_57", "src_text": "In this example, the model has seen shallow recursion during training and is tested on an example with deeper recursion.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "In diesem Beispiel hat das Modell eine flache Rekurrenz während des Trainings und wird mit tiefer Rekurrenz getestet.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_91.wav", "doc_id": "TVCREhgqUP.seg_91", "src_text": "If you want to learn more about our experiments and how we address these challenges, please have a look at our paper or come to our poster.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wenn Sie mehr über unsere Experimente und wie wir diese Herausforderungen angehen, erfahren wollen, dann schauen Sie sich bitte unser Papier an oder kommen Sie zu unserem Posten.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_796.wav", "doc_id": "WTTtiRKFZI.seg_796", "src_text": "\"Marge read this absolutely fascinating book about bees yesterday.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "in Ordnung, das Buch über das Heute ist absolut", "score": 5.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_618.wav", "doc_id": "oeooqChmKK.seg_618", "src_text": "In the Background-Pretrain setting, we assume that the background knowledge \"Politicians seek elected seats in government\" is contained in the pretrained parameters and in inference-time context we provide the entity-specific knowledge \"Chichester is a politician.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Er kehrte zurück zur Voreinstellung. Wir nehmen an, dass Politiker gewählte Sitze in der Regierung anstreben. Es ist in den vorgebildeten Parametern enthalten. Im Kontext der Freiheit? Wir liefern die antipsychologische Kenntnis: Chichester ist ein Politiker..", "score": 54.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_76.wav", "doc_id": "TVCREhgqUP.seg_76", "src_text": "For the first output position, we simply select one, as highlighted in red.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Für die erste Ausgabestelle wählen wir einfach eine, wie in Rot hervorgehoben.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_776.wav", "doc_id": "WTTtiRKFZI.seg_776", "src_text": "So these two approaches are asymmetric.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "so dass diese beiden Ansätze", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_725.wav", "doc_id": "oaOHnMCwad.seg_725", "src_text": "And so that concludes our presentation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und so die Präsentation, aber", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_280.wav", "doc_id": "PIZEXUFLAR.seg_280", "src_text": "So for the training dataset, we use 53 tasks from 9 groups for training and we sample 10,000 instances per task.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Daher verwenden wir für die Trainingsdatenbank 53 Aufgaben aus der Gruppe 9 für die Ausbildung und wir stellen 10.000 Instanzen pro Aufgabe", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_744.wav", "doc_id": "XejEJmgUmE.seg_744", "src_text": "And what we do is that to recreate like longer sequences and which are acceptable and which has the same matching of the grammatical structure.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Was wir tun, ist, längere Sequenzen zu rekonstruieren, die akzeptabel sind und die gleiche Grammatikstruktur haben.", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_104.wav", "doc_id": "uZBWfYjYnf.seg_104", "src_text": "That is the cross-attention mechanism, and you can see an example on the right.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "das Cross-Attention-Mechanism, und Sie können ein Beispiel auf der rechten Seite sehen.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_338.wav", "doc_id": "dJGfOSFgZO.seg_338", "src_text": "We hope ABC-Eval can be leveraged by others in the field as a meaningful step in this direction.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir hoffen, dass ABC-EVAL von anderen auf diesem Gebiet als ein sinnvoller Schritt in diese Richtung genutzt werden kann,", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_670.wav", "doc_id": "FLkGnzVRew.seg_670", "src_text": "We also find that iterative update is useful for transfer learning from a different domain, whereas in domain active annotations benefit from cumulative update.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir finden auch heraus, dass die iterative Aktualisierung für das Transferlernen aus einer anderen Domäne nützlich ist, wobei aktive Anmerkungen in der Domäne von der Aktualisierung profitieren.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_165.wav", "doc_id": "SLpqvupgvW.seg_165", "src_text": "Here, a user wants to select between one of these two songs.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Benutzer möchte zwischen diesen beiden Liedern wählen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_435.wav", "doc_id": "hgIDlKNiFM.seg_435", "src_text": "We also introduced a comparison of models with multiple pre-training settings and data sources.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir stellen außerdem einen Vergleich von Modellen mit mehreren präzisen Einstellungen und Datenquellen vor.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_436.wav", "doc_id": "hgIDlKNiFM.seg_436", "src_text": "Then, we present our results on 11 biomedical and clinical downstream tasks in French.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "stellen wir unsere Ergebnisse auf elf biomedizinischen und klinischen Downstream-Aufgaben in Französisch vor", "score": 84.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_138.wav", "doc_id": "wLqFAuDnKa.seg_138", "src_text": "In our experiments, we settled for a 5-shot prompting strategy where we just marked each sentence that we provide to the system, with the language it's in.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "In unseren Experimenten haben wir uns für eine Strategie mit fünf Schüssen entschieden, bei der wir die Sätze, die wir dem System zur Verfügung stellen, einfach markieren.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_657.wav", "doc_id": "FLkGnzVRew.seg_657", "src_text": "Thus, this is the model that we use to cold start the active learning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "dies das Modell, das wir verwenden, um das aktive Lernen zu", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_86.wav", "doc_id": "TVCREhgqUP.seg_86", "src_text": "In addition, sometimes there are multiple permutations that are consistent with the data, but the linguistically correct one is latent.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Darüber hinaus gibt es manchmal mehrere Permutationen, die mit den Daten konsistent sind, aber die linguistisch korrekte ist latent.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_605.wav", "doc_id": "oeooqChmKK.seg_605", "src_text": "The task here is to identify the correct entity that the pronoun \"he\" refers to, which in this case is Servin.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die Aufgabe hier besteht darin, die korrekte Entität zu identifizieren, auf die sich das Pronomen bezieht, was in diesem Fall ein Bediensteter ist.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_464.wav", "doc_id": "SUkmfOTvGi.seg_464", "src_text": "Today I'm going to present our paper Do CoNLL-2003 named entity taggers still work well in 2023?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Heute werde ich unsere Arbeit vorstellen: 'Do CONLL 2003 named entity taggers still work well in 2023?' Lassen", "score": 61.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_45.wav", "doc_id": "aQpIWggfCo.seg_45", "src_text": "Thanks for your time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Vielen Dank für Ihre Zeit,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_157.wav", "doc_id": "wLqFAuDnKa.seg_157", "src_text": "For more details, please come to the full presentation of the paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "für mehr Details kommen Sie bitte zur vollständigen Präsentation des Papiers,", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_234.wav", "doc_id": "oYCKgTzTDy.seg_234", "src_text": "We also test Monolingual Few-shot setting by training monolingual models with only 10% of training data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir testen auch die monolinguale Führerschein-Prüfung, indem wir monolinguale Modelle mit nur zwölf Prozent der Ausbildungsdaten trainieren.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_124.wav", "doc_id": "wLqFAuDnKa.seg_124", "src_text": "PaLM is a 540 billion-parameter large language model presented last year in 2022.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Farm ist ein groß angelegtes Modell der Groß- und Kleinschreibung, das im vergangenen Jahr vorgestellt wurde.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_92.wav", "doc_id": "uZBWfYjYnf.seg_92", "src_text": "Hi, I'm Sara Papi from the University of Trento and Foundazione Bruno Kessler and I will briefly introduce the \"Attention as a Guide for Simultaneous Speech Translation\" paper, that is a joint work with Matteo Negri and Marco Turchi.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hallo, ich bin Serapapi von der Universität von Trento und der Stiftung Bruno Kessler, und ich werde kurz die Aufmerksamkeit als Leitfaden für gleichzeitige Sprachübersetzungspapier erläutern, das eine gemeinsame Arbeit mit Matteo Negri und Marco Turki ist.", "score": 73.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_716.wav", "doc_id": "oaOHnMCwad.seg_716", "src_text": "We find this in the GPT 4 social acceptability task as well as the Dynahate task analysis as well.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "vergleichen. Dies finden wir in der GPD-Sozialverträglichkeitsprüfung, wie z. B. in der Dienen-Task-Analyse.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_45.wav", "doc_id": "aQpIWggfCo.seg_45", "src_text": "Thanks for your time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Vielen Dank für Ihre Zeit.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_223.wav", "doc_id": "oYCKgTzTDy.seg_223", "src_text": "The Lambda calculus is missing, or they're only evaluated on certain neural models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Kannada, Malayalam, Sinhala, Urdu, Pashto, Hausa, Swahili, Amharisch, Somali, Zulu, Oromo,", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_659.wav", "doc_id": "FLkGnzVRew.seg_659", "src_text": "\"Cumulative\" accumulates all the data collected from active annotation so far, whereas \"Iterative\" updates the model by training on the latest set of data collected.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Alle Daten, die wir aus den aktiven Anmerkungen gesammelt haben, werden kumuliert. Bei den unterschiedlichen Strategien stellen wir fest, dass", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_538.wav", "doc_id": "dvGkKzmIaN.seg_538", "src_text": "We assume the provider apply wiki text data set to count word frequency.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir gehen davon aus, dass der Anbieter den Wikitext-Datensatz verwendet, um die Häufigkeit von Wörtern zu zählen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_395.wav", "doc_id": "WBLMIsdIrq.seg_395", "src_text": "First, when does translation require context?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Erstens, was sind die Voraussetzungen für die Übersetzung und", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_308.wav", "doc_id": "dJGfOSFgZO.seg_308", "src_text": "Hello, I'm James Finch.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hallo, ich bin James Finch", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_39.wav", "doc_id": "aQpIWggfCo.seg_39", "src_text": "With CoScript we can try smaller but specialized models for constrained language planning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "mit CoScript können wir kleinere, aber spezialisierte Modelle für die konstrizierte Sprachplanung auswählen.", "score": 87.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_755.wav", "doc_id": "XejEJmgUmE.seg_755", "src_text": "We increase the context length toward up to 1024 for to max out OPT and GPT 2 models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir haben die Kontextlänge auf bis zu Tausend erhöht, um die Maximalausgabe von Opts und Gpt2-Modellen zu ermöglichen, und", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_790.wav", "doc_id": "WTTtiRKFZI.seg_790", "src_text": "Right?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "sieht, weil", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_544.wav", "doc_id": "dvGkKzmIaN.seg_544", "src_text": "Thank you.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dank.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_839.wav", "doc_id": "GvEBWkLmuI.seg_839", "src_text": "Immediately we see that, while the outputs aren't overtly negative or toxic in the traditional sense of these words, there are some interesting patterns.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir sehen sofort, dass die Ergebnisse in diesen Fällen negativ oder toxisch sind. Das sind einige interessante Muster.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_45.wav", "doc_id": "aQpIWggfCo.seg_45", "src_text": "Thanks for your time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Vielen Dank für Ihre Zeit.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_474.wav", "doc_id": "SUkmfOTvGi.seg_474", "src_text": "We evaluated them on both the CoNLL-03 test sets and the CoNLL++.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "2003 fein abgestimmt und beurteilt, sowohl das Cornu III-Testset als auch das Cornu Plus-Testset. Und schließlich,", "score": 62.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_0.wav", "doc_id": "aQpIWggfCo.seg_0", "src_text": "Hi, I'm Siyu Yuan from Fudan University.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hallo, ich bin Siyuan von der Universität von Fudan.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_448.wav", "doc_id": "hgIDlKNiFM.seg_448", "src_text": "One based on the weight of CamemBERT and trained on a 4 GB set of NACHOS.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Eines basiert auf dem Gewicht von Camembert und trainiert auf vier Gigabyte, ein anderes basiert auf", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_464.wav", "doc_id": "SUkmfOTvGi.seg_464", "src_text": "Today I'm going to present our paper Do CoNLL-2003 named entity taggers still work well in 2023?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hong, heute werde ich unsere Arbeit vorstellen: Funktionieren die Etiketten von Corel Corporation noch gut im Jahr", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_254.wav", "doc_id": "oYCKgTzTDy.seg_254", "src_text": "We also find some other interesting findings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir stellen auch einige andere interessante Ergebnisse", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_190.wav", "doc_id": "SLpqvupgvW.seg_190", "src_text": "The first one is uniform at random.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Die erste Methode ist die gleichmäßige", "score": 61.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_758.wav", "doc_id": "XejEJmgUmE.seg_758", "src_text": "So here we are choosing or creating sentences from acceptable and unacceptable domains from the same BLiMP or SyntaxGym dataset.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Daher wählen wir oder erstellen Sätze aus akzeptablen und unakzeptablen Domänen aus dem gleichen Blimp- oder Syntax-Datensatz.", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_587.wav", "doc_id": "rISrKoXQCx.seg_587", "src_text": "I think that's pretty much all I have for today.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "ich denke, das ist ziemlich viel, was ich für heute erreicht", "score": 86.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_54.wav", "doc_id": "TVCREhgqUP.seg_54", "src_text": "And \"Mary knew that the girl slept.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und im Mary-new-that-slip. Dies sind die Ausdrücke,", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_311.wav", "doc_id": "dJGfOSFgZO.seg_311", "src_text": "This work was done by the Emory NLP Lab led by Professor Jinho Choi at Emory University and in collaboration with Amazon Alexa AI.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Diese Arbeit wurde vom Emery N. P. Lab unter der Leitung von Professor Gino Choy an der Emery University und in Zusammenarbeit mit Amazon Alexa AI durchgeführt.", "score": 73.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_621.wav", "doc_id": "oeooqChmKK.seg_621", "src_text": "We evaluate the data set both with human study participants, and established coreference resolution models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir bewerten den Datensatz sowohl mit den Ergebnissen der Studienteilnehmer als auch mit den festgelegten Lösungsmodellen.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_461.wav", "doc_id": "hgIDlKNiFM.seg_461", "src_text": "All the pre-trained model obtained from NACHOS are freely available on Hugging Face, and under the MIT license, and all the training scripts are on our GitHub repository.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Alle vorbereitenden Modelle, die von Natchos erhalten wurden, sind frei verfügbar und auf YouTube und alle Trainings-Skripte sind auf unserem Git Repository verfügbar. Daher", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_276.wav", "doc_id": "PIZEXUFLAR.seg_276", "src_text": "Here we show some example instances from our MultiInstruct dataset, to unify the processing of various input and output data types.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hier zeigen wir einige Beispiele aus unserem Multi-Instanz-Datensatz. Um die Verarbeitung verschiedener Eingabe- und Ausgabedatentypen zu vereinheitlichen.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_589.wav", "doc_id": "oeooqChmKK.seg_589", "src_text": "Hello everyone, I'm Akshatha, and today my co-author Martin and I are presenting our work \"The KITMUS Test: Evaluating Knowledge Integration from Multiple Sources.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo alle, ich bin amtierender und heute präsentiere ich meine Arbeit zum Thema Wissensintegration aus mehreren Quellen.", "score": 51.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_287.wav", "doc_id": "PIZEXUFLAR.seg_287", "src_text": "So during test for each task, we conduct a total of 5 experiments by evaluating the model using one of the five instructions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Während des Tests für jede Aufgabe führen wir insgesamt fünf Experimente durch, indem wir das Modell mit einer der fünf Anweisungen", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_471.wav", "doc_id": "SUkmfOTvGi.seg_471", "src_text": "To investigate these problems, we developed the CoNLL++ Dataset.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Um diese Probleme zu untersuchen, entwickeln wir den Datensatz „Carno", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_848.wav", "doc_id": "GvEBWkLmuI.seg_848", "src_text": "So the Marked Words method draws upon the sociolinguistic concept of \"markedness\", which states that there is an unmarked default, and any group that differs from that default is linguistically marked.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Die Markierungs-Methode basiert auf dem sozio-linguistischen Konzept der Markiertheit, das besagt, dass es eine unmarkierte Definitheit gibt und dass jede Gruppe, die sich von dieser Definitheit unterscheidet, sprachlich markiert ist.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_300.wav", "doc_id": "PIZEXUFLAR.seg_300", "src_text": "So this shows the effect of different fine-tuning strategies on the model sensitivity.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Dies zeigt den Einfluss verschiedener Fine-Tuning-Strategien auf die Modell-Sensitivität.", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_671.wav", "doc_id": "FLkGnzVRew.seg_671", "src_text": "These are the links to our core data set and our paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dies sind die Links zu Ihrem Code-Set und Ihrem Papier.", "score": 78.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_668.wav", "doc_id": "FLkGnzVRew.seg_668", "src_text": "However, the annotators also find the examples difficult.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "die Annotatoren finden die Beispiele auch schwierig. \"Nein, ich bin nicht hier, um zu lachen.\"", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_0.wav", "doc_id": "aQpIWggfCo.seg_0", "src_text": "Hi, I'm Siyu Yuan from Fudan University.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hallo, ich bin von der Universität von", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_620.wav", "doc_id": "oeooqChmKK.seg_620", "src_text": "In the Background-Inference setting, we provide the fictional occupation \"mirituer\" instead of politician because \"mirituer\" is unlikely to be contained in the pretrained parameters.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "man sich auf die eigentliche Sache konzentrieren. Sie bieten eine fiktive Beschäftigung meritocracy anstelle von Politiker. Denn Miretta ist unwahrscheinlich in den vorgegebenen Parametern enthalten.", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_585.wav", "doc_id": "rISrKoXQCx.seg_585", "src_text": "So it's kind of like the electric trolley problem.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "ich denke, das ist ziemlich", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_622.wav", "doc_id": "oeooqChmKK.seg_622", "src_text": "In this figure, we show the results of the best-performing models on the most difficult variant of the Background-Pretrain setting.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In dieser Abbildung zeigen wir die Ergebnisse der besten Leistungsmodelle auf dem schwierigsten Variante des Hintergrundvorbereitungssettings. mit", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_575.wav", "doc_id": "rISrKoXQCx.seg_575", "src_text": "We further show many qualitative examples to see that language models with different political leanings do give different predictions to hate speech and misinformation examples based on their social categories.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "viele qualitative Beispiele vorgestellt, um zu zeigen, dass Sprachmodelle mit unterschiedlichen politischen Bedeutungen unterschiedliche Vorhersagen zu Hassrede und Fehlinformationen basierend auf sozialen Kategorien", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_648.wav", "doc_id": "FLkGnzVRew.seg_648", "src_text": "As can be seen here, dissonance was only found in 3.5% of the annotated pairs.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wie man hier sehen kann, wurde Dissonanz nur in 3,5% der annotierten Paare gefunden.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_148.wav", "doc_id": "wLqFAuDnKa.seg_148", "src_text": "And their results so a better performance when using the dev data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "die Trainingsdaten viel besser. Spezialisierte Systeme", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_749.wav", "doc_id": "XejEJmgUmE.seg_749", "src_text": "So here the sentences are still coming from a, relevant data sets but it's not from the same data set that you are evaluating with.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hier kommen die Sätze immer noch aus relevanten Datensätzen, aber nicht aus demselben Datensatz, mit dem Sie bewerten, und wir", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_21.wav", "doc_id": "aQpIWggfCo.seg_21", "src_text": "Thus, we adopt the idea of over-generate-then-filter to improve generation quality.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "daher haben wir die Idee des überproduzierten Tiefenfilters angenommen, um die Erzeugungsqualität zu verbessern.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_49.wav", "doc_id": "TVCREhgqUP.seg_49", "src_text": "This is joint work with my advisors Alexander Koller and Ivan Titov.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Dies ist gemeinsame Arbeit mit meinen Beratern Alexander Koller und Ivan Titov.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_379.wav", "doc_id": "gGbuDbHhyc.seg_379", "src_text": "Finally, we have open-sourced our code.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Schließlich haben wir unseren Code mit offener Quelle,", "score": 77.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_5.wav", "doc_id": "aQpIWggfCo.seg_5", "src_text": "However, previous work mainly focuses on planning for the abstract goals of stereotypical activities.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vorherige Arbeiten konzentrierten sich jedoch hauptsächlich auf die Planung für die abstracten Ziele stereotypischer Aktivitäten;", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_649.wav", "doc_id": "FLkGnzVRew.seg_649", "src_text": "On collecting around 1,000 examples of discourse unit pairs, we ran training for an initial classifier trained only on 43 examples of dissonance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wenn wir uns um tausende Beispiele von Diskurs-Einheiten drehen, trainieren wir für einen Initialklassifikator, trainieren wir nur an 43 Beispielen", "score": 49.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_805.wav", "doc_id": "WTTtiRKFZI.seg_805", "src_text": "Right?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "richtig, es", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_169.wav", "doc_id": "SLpqvupgvW.seg_169", "src_text": "Or the pronunciations are too similar to each other and hard to disambiguate.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Alle Aussprachen sind zu ähnlich miteinander und schwer zu unterscheiden.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_168.wav", "doc_id": "SLpqvupgvW.seg_168", "src_text": "This could happen when the user cannot remember the name of the song.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Dies würde passieren, wenn der Benutzer den Namen der Liedes nicht mehr merken kann.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_428.wav", "doc_id": "WBLMIsdIrq.seg_428", "src_text": "To summarize, we perform a data-driven analysis across 14 language pairs to identify when translations require context and then we use our findings to build a benchmark for document-level machine translation which can help us identify which discourse phenomena models can handle well or not, and which translation systems are good at document-level translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "für die lokale Dokumentübersetzung. Zusammengefasst führen wir Datenanalysen in vierzehn Sprachen durch, um zu identifizieren, wann Übersetzungen erforderlich sind. Und wir verwenden unsere Erkenntnisse, um einen Benchmark für die Dokumentenebene-Übersetzung zu erstellen, der dabei helfen kann, zu identifizieren, welche Phänomene beherrscht werden können und welche nicht, und welche Übersetzungssysteme auf der Dokumentenebene gut sind.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_520.wav", "doc_id": "dvGkKzmIaN.seg_520", "src_text": "Embedding marker contains two main steps.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Embedding-Marker enthält zwei Hauptschritte:", "score": 68.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_226.wav", "doc_id": "oYCKgTzTDy.seg_226", "src_text": "We provide a uniform data set XSemPLR for cross-lingual semantic parsing in multiple natural languages and meaning representations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Irisch, Scots, Faroese, Isländisch, Färöisch, Norwegisch, Dänisch, Niederländisch, Deutsch, Schwedisch, Dänisch, Niederländisch, Norwegisch, Dänisch, Niederländisch, Norwegisch, Dänisch, Niederländisch,", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_717.wav", "doc_id": "oaOHnMCwad.seg_717", "src_text": "So, given that there is positionality in NLP, what can we do about it?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "tun, dass es eine Position in LED und LP gibt? Wir haben einige", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_120.wav", "doc_id": "uZBWfYjYnf.seg_120", "src_text": "And we also released open source the code and models and simultaneous output to facilitate the reproducibility of our work.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Papier und wir veröffentlichen auch den Quellcode und die Modelle und Simulatoren, um die Reproduzierbarkeit unserer Arbeit zu erleichtern,", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_453.wav", "doc_id": "hgIDlKNiFM.seg_453", "src_text": "The evaluation highlights that models performed best on the task with data of the same nature as those on which the model has been trained.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hundertachtundzwanzig-Gigabyte-Zeile, eine Zeile eine Hundertachtundzwanzig-Gigabyte-Zeile, eine Zeile eine Hundertachtundzwanzig-Gigabyte-Zeile, eine Zeile eine Hundertachtundzwanzig-Gigabyte-Zeile, eine Zeile eine Hundertachtundzwanzig-Gigabyte-Zeile, eine Ze Die Bewertung hebt", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_76.wav", "doc_id": "TVCREhgqUP.seg_76", "src_text": "For the first output position, we simply select one, as highlighted in red.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Für die erste Ausgabeposition wählen wir einfach den rot markierten.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_489.wav", "doc_id": "SUkmfOTvGi.seg_489", "src_text": "And this shows us that adaptive overfitting in this case is not observed.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Dies zeigt uns, dass adaptiver Überanschnitt in diesem Fall nicht beobachtet wird.", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_66.wav", "doc_id": "TVCREhgqUP.seg_66", "src_text": "In this paper, we don't use trees and introduce a neural seq2seq model that directly models the correspondences between fragments of the input and fragments of the output.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In diesem Papier verwenden wir keine „Trees“ und stellen ein neues Sequenz-Modell vor, das die Korrespondenzen zwischen Fragmenten des Inputs und Fragmenten des Outputs direkt modelliert.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_244.wav", "doc_id": "oYCKgTzTDy.seg_244", "src_text": "We found that Encoder-Decoder obtains the best performance on all nine datasets.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir stellten fest, dass der Encoder-Decoder die beste Leistung auf allen neun Datensätzen erzielt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_779.wav", "doc_id": "WTTtiRKFZI.seg_779", "src_text": "Now those are asymmetric approaches to coordinate structures, such as the Prague approach.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "herauslösen. Jetzt gibt es auch symmetrische Ansätze zur Koordinierung von Koordinatursystemen wie dem", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_684.wav", "doc_id": "oaOHnMCwad.seg_684", "src_text": "Positionality is simply the perspectives that people hold as a result of their demographics, identity, and life experiences.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Menschen. Dies ist ein Konzept, das in kritischen Studien", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_108.wav", "doc_id": "uZBWfYjYnf.seg_108", "src_text": "This means that the first two words will be emitted while since the sum of the cross-attention is above a certain threshold alpha, we will not emit the last word and we wait for another speech chunk.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dies bedeutet, dass die ersten beiden Wörter weggelassen werden. Während die Summe der Kreuzanmerkung über einem bestimmten Alpha-Grenzwert liegt, werden wir das letzte Wort nicht aussprechen und auf einen anderen Satz warten.", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_681.wav", "doc_id": "oaOHnMCwad.seg_681", "src_text": "Where prospective AP is really not as sensitive to offensive terms that are more common in Indian contexts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "die Perspektiv-API. AI ist wirklich nicht so empfindlich gegen beleidigende Begriffe, die in indischen Kontexten häufiger vorkommen.", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_669.wav", "doc_id": "FLkGnzVRew.seg_669", "src_text": "In summary, we find that PRC is a simple AL strategy for rare class acquisition and cold starting AL with appropriately designed transfer learning task and help significantly.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Insgesamt stellen wir fest, dass die PRCA eine einfache AL-Strategie für die Akquisition von Hochschulabsolventen ist, und starten AL mit angemessen entworfenen Transfer-Lernaufgaben, die helfen können, signifikant.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_170.wav", "doc_id": "SLpqvupgvW.seg_170", "src_text": "Or when the user wants to specify a preference.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "oder wenn der Benutzer eine Vorliebe spezifizieren möchte.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_672.wav", "doc_id": "FLkGnzVRew.seg_672", "src_text": "Feel free to get in touch with us if you have any questions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Fühlen Sie sich frei, mit ihnen in Kontakt zu treten,", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_119.wav", "doc_id": "uZBWfYjYnf.seg_119", "src_text": "If you want to discover more results, read our paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wenn Sie mehr Ergebnisse finden möchten, lesen Sie bitte unser", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_292.wav", "doc_id": "PIZEXUFLAR.seg_292", "src_text": "So this measures the model's ability to consistently produce the same outputs for the same task regardless of the slight variation in the wording of the instruction.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "die Fähigkeit des Modells misst, dieselben Ausgänge für dieselbe Aufgabe zu produzieren, unabhängig von einer geringen Variation der Wortwahl der Anweisung.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_8.wav", "doc_id": "aQpIWggfCo.seg_8", "src_text": "An abstract goal can be inherited by different real-life specific goals with multi-faceted constraints.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Ein abstraktes Ziel kann von unterschiedlichen spezifischen Zielen im realen Leben mit mehrfachen Beschränkungen geerbt werden.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_176.wav", "doc_id": "SLpqvupgvW.seg_176", "src_text": "The cartoon has three speech bubbles.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Der Cartoon hat drei Sprachblasen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_102.wav", "doc_id": "uZBWfYjYnf.seg_102", "src_text": "Use only one model for every latency regime and handle latency through specific parameters.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "erstellen. Verwenden Sie nur ein Modell für jedes Latenzregime und bearbeiten Sie die Latenz über spezifische Parameter.", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_460.wav", "doc_id": "hgIDlKNiFM.seg_460", "src_text": "We are also observing that more specialized data is better, but it doesn't scale well.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "stellen wir fest, dass spezialisierte Daten besser sind, mehr spezialisierte Daten sind", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_398.wav", "doc_id": "WBLMIsdIrq.seg_398", "src_text": "In the previous work, we introduced CXMI as a measure for context usage by machine translation models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "In der vorangegangenen Arbeit stellen wir XMI als Maßstab für die Verwendung von Maschinenübersetzungsmodellen vor, und dies wird", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_840.wav", "doc_id": "GvEBWkLmuI.seg_840", "src_text": "The Asian woman is depicted as unassuming; the Middle-Eastern woman is referred to using words like exotic and like, referring to a mesmerizing region.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Die asiatische Frau wird als unscheinbar dargestellt, die mittelöstliche Frau wird mit Ausdrücken wie exotisch bezeichnet. Und wie man sich auf eine faszinierende Region bezieht,", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_831.wav", "doc_id": "GvEBWkLmuI.seg_831", "src_text": "However, these measures have various limitations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Allerdings haben diese Maßnahmen verschiedene Einschränkungen:", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_652.wav", "doc_id": "FLkGnzVRew.seg_652", "src_text": "To alleviate this, we experiment over combinations of transfer learning and active learning to annotate such that more dissonant samples can be collected over lesser annotation runs, lowering the overall annotation costs while improving dissonance detection.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Um dies zu erleichtern, wurden Experimente über Kombinationen von Transfer-Lernen und aktiven Lernen durchgeführt, um zu annotieren, dass mehr dissonante Beispiele über niedrigere Annotation-Runden gesammelt werden können, indem die Gesamtkosten der Annotation durch die Verbesserung der Dissonanzdetektion gesenkt werden.", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_425.wav", "doc_id": "WBLMIsdIrq.seg_425", "src_text": "But these models are not much better than models that do not use context on other phenomena like ellipsis, pronouns, and verb form.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "aber diese Modelle sind nicht viel besser als Modelle, die diese Phänomene nicht verwenden,", "score": 73.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_694.wav", "doc_id": "oaOHnMCwad.seg_694", "src_text": "The first step is to re annotate data sets with diverse annotators.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Der erste Schritt besteht darin, Datensätze mit verschiedenen Annotatoren neu zu annotieren.", "score": 87.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_46.wav", "doc_id": "aQpIWggfCo.seg_46", "src_text": "Please find more details of CoScript in our paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "bitte finden Sie mehr Details von CoScript in unserem Papier.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_821.wav", "doc_id": "WTTtiRKFZI.seg_821", "src_text": "So I'll concentrate on the right one.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "rechten Wort, mich auf das rechte Wort konzentriere.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_22.wav", "doc_id": "aQpIWggfCo.seg_22", "src_text": "We first show constraint types with examples for InstructGPT and obtain specific goals based on the seed abstract goals.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zunächst zeigen wir konstruktionsbedingte Typen mit Beispielen für Integrität und erhalten spezifische Ziele basierend auf den genannten Abschnitten. Daher. Dann generiert", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_561.wav", "doc_id": "rISrKoXQCx.seg_561", "src_text": "Secondly, we aim to investigate to which extent the political biases of language models are actually picked up from training data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Zweitens werden wir die politischen Sprachmodelle untersuchen, die sich auf den Bereich beziehen,", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_381.wav", "doc_id": "gGbuDbHhyc.seg_381", "src_text": "Please feel free to check it out.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "finden, bitte fühlen Sie sich frei, ihn auszuprobieren.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_609.wav", "doc_id": "oeooqChmKK.seg_609", "src_text": "Generally, background knowledge is learned during the pretraining of large language models, while entity-specific knowledge is typically observed at inference time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Fälle entscheiden. In der Regel wird Hintergrundwissen während der Vorbereitung großer Sprachmodelle gelernt, während spezifisches Wissen über Entitäten typischerweise während der Inferenzzeit beobachtet wird.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_222.wav", "doc_id": "oYCKgTzTDy.seg_222", "src_text": "But Chinese is missing and lack of coverage on certain meaning representation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Das Chinesische fehlt. Leuchttürme konnten viele ungewisse Repräsentationen bedecken. Die", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_528.wav", "doc_id": "dvGkKzmIaN.seg_528", "src_text": "The weight of the target embedding is proportional to the number of triggers in the sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Das Gewicht der Zielverankerung ist proportional zur Anzahl der Auslöser in einer Satz.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_35.wav", "doc_id": "aQpIWggfCo.seg_35", "src_text": "In total, we generate 55,000 specific goals with scripts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Insgesamt generieren wir fünfzigtausend spezifische Ziele mit Skripten,", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_209.wav", "doc_id": "SLpqvupgvW.seg_209", "src_text": "If the language model has access to some partially overlapping background knowledge, then the accuracy is between 82 to 87%, which is more realistic.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wenn das Sprachmodell teilweise überlappendes Hintergrundwissen hat, liegt die Genauigkeit zwischen achtundzwanzig und achtundachtzig Prozent, was realistischer ist,", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_221.wav", "doc_id": "oYCKgTzTDy.seg_221", "src_text": "For instance, there are lots of coverage on certain natural languages.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Kroatisch, Tschechisch, Bulgarisch, Rumänisch, Litauisch, Lettisch, Estnisch, Georgisch, Armenisch, Aserbaidschanisch, Kasachisch, Tadschikisch,", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_158.wav", "doc_id": "wLqFAuDnKa.seg_158", "src_text": "Thank you very much.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "vielen Dank.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_75.wav", "doc_id": "TVCREhgqUP.seg_75", "src_text": "We go from left to right over the output and determine which multiset token to put in every position.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir gehen von links nach rechts über den Ausgang und bestimmen, welche Multiset-Token in jeder Position gesetzt werden sollen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_20.wav", "doc_id": "aQpIWggfCo.seg_20", "src_text": "Previous studies have shown that the output quality of language models falls in high variance, leading to bad performance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Frühere Studien haben gezeigt, dass die Ausgangsqualität von Laser-Modellen in hohen Bereichen abfällt, was zu schlechter Leistung führt.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_590.wav", "doc_id": "oeooqChmKK.seg_590", "src_text": "This work is a collaboration between McGill University, Mila, and Microsoft Research.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "eine Zusammenarbeit zwischen der Universität McGill und Microsoft Research ist.", "score": 86.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_684.wav", "doc_id": "oaOHnMCwad.seg_684", "src_text": "Positionality is simply the perspectives that people hold as a result of their demographics, identity, and life experiences.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Positionalität ist einfach die Perspektiven, die Menschen als Ergebnis ihrer Demographie, Identität und Lebenserfahrungen haben.", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_812.wav", "doc_id": "WTTtiRKFZI.seg_812", "src_text": "So the proportion is bigger of the left short conjunct.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "stärkere, richtig? Und die Proportion ist größer als die der linken kürzeren Konjunktion. Aber was in", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_100.wav", "doc_id": "uZBWfYjYnf.seg_100", "src_text": "So what is our solution?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "ist unsere Lösung?", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_290.wav", "doc_id": "PIZEXUFLAR.seg_290", "src_text": "If it's a multi-modal generation task, we report Rouge-L. For NLP task, we report Rouge-L as well.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "es sich um eine multimodale Generierung handelt, berichten wir über RUGL, für NRP-Aufgaben berichten wir auch über RUGL.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_41.wav", "doc_id": "aQpIWggfCo.seg_41", "src_text": "In summary, we establish the constrained language planning problem.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zusammenfassend: Wir stellen das Problem der eingeschränkten Sprachplanung fest,", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_614.wav", "doc_id": "oeooqChmKK.seg_614", "src_text": "Lastly, the \"Background-Inference\" setting, where both knowledge types are available only at inference time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "und die Vordergrund-Einstellungen nur während des Trainings verfügbar sind.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_94.wav", "doc_id": "uZBWfYjYnf.seg_94", "src_text": "Simultaneous speech translation, or SimulST, is the process of translating spoken language into a text in another language in real time, enabling cross-language communication.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Sprachübersetzung? Simultane Sprachübersetzung ist der Prozess der Übersetzung einer gesprochenen Sprache in einen Text in einer anderen Sprache in Echtzeit, wodurch eine Kreuzsprachkommunikation ermöglicht wird.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_419.wav", "doc_id": "WBLMIsdIrq.seg_419", "src_text": "And finally, we use our benchmark as well as other metrics to evaluate different models on the document-level machine translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Und schließlich verwenden wir unseren Benchmark sowie andere Metriken, um verschiedene Modelle auf Dokumentebene zur maschinellen", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_220.wav", "doc_id": "oYCKgTzTDy.seg_220", "src_text": "Existing cross-lingual semantic parsing models are separately proposed and evaluated on data set of limited tasks and applications.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Portugiesisch, Italienisch, Polnisch, Griechisch, Türkisch, Persisch, Vietnamesisch, Thai, Indonesisch, Schwedisch, Dänisch, Niederländisch, Norwegisch, Finnisch, Ungarisch, Slowakisch, Slowenisch,", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_83.wav", "doc_id": "TVCREhgqUP.seg_83", "src_text": "In our paper, we solve a couple of interesting technical challenges.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In unserer Zeitung werden interessante technische Herausforderungen", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_661.wav", "doc_id": "FLkGnzVRew.seg_661", "src_text": "Next, to improve the number of dissonance examples, we use a Probability-of-Rare-Class strategy — PRC — to select mostly the examples that are highly likely to be descended by the current model at any round of rare.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Um die Anzahl der Divergenzbeispiele zu verbessern, verwenden wir die Wahrscheinlichkeit einer echten Klassifizierungsstrategie, wobei die meisten Beispiele in jeder Runde höchstwahrscheinlich durch das aktuelle Modell unterschieden werden.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_479.wav", "doc_id": "SUkmfOTvGi.seg_479", "src_text": "Through our experiments we found that the transformer models normally generalize better to new data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "durch unsere Experimente haben wir festgestellt, dass die Transformer-Modelle normalerweise besser auf neue Daten generalisiert", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_816.wav", "doc_id": "WTTtiRKFZI.seg_816", "src_text": "It's absent in the second example \"Homer came and sneezed.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Im zweiten Beispiel wird", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_87.wav", "doc_id": "TVCREhgqUP.seg_87", "src_text": "We address this by inducing the alignment as part of the training.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir beheben dies, indem wir die Ausrichtung als Teil des Trainings induzieren.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_394.wav", "doc_id": "WBLMIsdIrq.seg_394", "src_text": "In this work, we try to answer these two questions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In dieser Arbeit versuchen wir, diese beiden Fragen zu beantworten:", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_538.wav", "doc_id": "dvGkKzmIaN.seg_538", "src_text": "We assume the provider apply wiki text data set to count word frequency.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir nehmen an, dass der Anbieter Wikitext auf das Datensatz anwendet, um die Wortfrequenz zu zählen.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_801.wav", "doc_id": "WTTtiRKFZI.seg_801", "src_text": "So here we have a dependency from \"read\" to the adjunct of length 7 measured in words and from \"read\" to \"book\" of length 4, so together it's 11.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Daher haben wir die Abhängigkeit von Rot bis zum Anfang des Wortes Länge sieben gemessen und von Rot bis zum Buchstaben L in einem Buchstaben L vier, also zusammen elf.", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_729.wav", "doc_id": "XejEJmgUmE.seg_729", "src_text": "I'm Koustav Sinha, and I'm pleased to welcome you to our talk of our ACL 2023 paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "ich bin Kostas Ioannidis und ich freue mich, Sie zu unserem Vortrag über unser ACL 2023 Paper: Language Model", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_343.wav", "doc_id": "gGbuDbHhyc.seg_343", "src_text": "This is joint work with Xiaoyu Shen, Marius Mosbach, Andreas Stephan, and Dietrich Klakow.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dies ist eine gemeinsame Arbeit mit Shaul Shalush, Marios Mouzakis, Andreas Stefan und Dietrich Klakow.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_335.wav", "doc_id": "dJGfOSFgZO.seg_335", "src_text": "They produce irrelevant information in around 15% of the responses, and they contradict themselves or their partner around 10% of the time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Sie produzieren ungefähr 15 der Antworten irrelevanten Informationen. Und sie widersprechen sich selbst oder ihrem Partner etwa 10 der Zeit.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_810.wav", "doc_id": "WTTtiRKFZI.seg_810", "src_text": "And, also the observation that was made in parsing that this tendency grows with length difference.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Beobachtung, die in dieser Arbeit gemacht wurde, ist, dass diese Tendenz mit zunehmender Länge des", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_474.wav", "doc_id": "SUkmfOTvGi.seg_474", "src_text": "We evaluated them on both the CoNLL-03 test sets and the CoNLL++.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir haben sie auf dem CORA3-Testset und dem CORA Testset bewertet. Zuletzt", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_149.wav", "doc_id": "wLqFAuDnKa.seg_149", "src_text": "Nevertheless, specialized state-of-the-art systems have a substantial advantage over the PaLM translations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "haben jedoch einen erheblichen Vorteil gegenüber Palm-Übersetzungen,", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_82.wav", "doc_id": "TVCREhgqUP.seg_82", "src_text": "Some other kinds of structural generalization remain very challenging, though.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Arten der Strukturverallgemeinerung bleiben jedoch sehr herausfordernd.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_82.wav", "doc_id": "TVCREhgqUP.seg_82", "src_text": "Some other kinds of structural generalization remain very challenging, though.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "andere Arten der strukturellen Veränderung erinnern sehr daran.", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_721.wav", "doc_id": "oaOHnMCwad.seg_721", "src_text": "Our third recommendation is to build specialised datasets and models within 4 specific communities.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "dritte Empfehlung besteht darin, spezielle Datensätze und Modelle innerhalb spezifischer Gemeinschaften zu erstellen,", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_683.wav", "doc_id": "oaOHnMCwad.seg_683", "src_text": "Design biases like the one that we just saw before might occur due to the positionality of the NLP researchers and model developers.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die Perspektive ist einfach die der Demografien, Identitäten und Lebenserfahrungen der", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_750.wav", "doc_id": "XejEJmgUmE.seg_750", "src_text": "And we can do the same for unacceptability case.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "können dasselbe für Unakzeptanzfälle tun.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_187.wav", "doc_id": "SLpqvupgvW.seg_187", "src_text": "Where A and B are samples from Wikipedia.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "A und B sind Beispiele aus Wikipedia.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_83.wav", "doc_id": "TVCREhgqUP.seg_83", "src_text": "In our paper, we solve a couple of interesting technical challenges.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In unserer Arbeit lösen wir eine paar interessante technische Herausforderungen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_526.wav", "doc_id": "dvGkKzmIaN.seg_526", "src_text": "When a user send a sentence to the provider service the provider counts the trigger number in the sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wenn ein Benutzer einen Satz an den Dienst des Anbieters sendet, zählt der Anbieter die Anzahl der Trigger im Satz.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_534.wav", "doc_id": "dvGkKzmIaN.seg_534", "src_text": "The cosine and L2 similarity between the requested embedding and the target embedding are computed.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die Kosinus- und L-Zwei-Ähnlichkeit zwischen dem angeforderten Einbetten und dem Ziel-Einbetten werden berechnet.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_876.wav", "doc_id": "GvEBWkLmuI.seg_876", "src_text": "And finally, there should really be increased transparency about bias mitigation methods, because for instance, like these positive stereotypes, we don't know if it's because there is some sort of weird overly-excessive value alignment going on, or maybe some other anti-stereotyping methods that are resulting in these pernicious patterns.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und schließlich sollte es wirklich eine Erhöhung der Transparenz durch die Methoden der Abtrennung geben. Weil diese positiven Stereotypen so sind, wissen wir nicht, ob es daran liegt, dass sie irgendwie seltsam sind. Übermäßig überbewertet, oder vielleicht auch andere Stereotypen, die zu diesen schädlichen Mustern führen.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_232.wav", "doc_id": "oYCKgTzTDy.seg_232", "src_text": "And we'll also test Monolingual Model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir testen auch ein monolinguales Modell,", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_208.wav", "doc_id": "SLpqvupgvW.seg_208", "src_text": "But this is not realistic.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "aber das ist nicht realistisch.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_738.wav", "doc_id": "XejEJmgUmE.seg_738", "src_text": "These days large language models are coming up with longer and longer context windows.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Heutzutage kommen große Sprachmodelle mit immer längeren Kontextfenstern", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_871.wav", "doc_id": "GvEBWkLmuI.seg_871", "src_text": "So rather than actually working towards changing those obstacles, it puts pressure on those people to overcome them, which leads to a very negative health outcomes for these people, among other harms.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die tatsächliche Arbeit daran, die Bedingungen zu ändern, bringt Druck auf diese Personen, was zu sehr negativen gesundheitlichen Auswirkungen für diese Personen führt. Wir", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_222.wav", "doc_id": "oYCKgTzTDy.seg_222", "src_text": "But Chinese is missing and lack of coverage on certain meaning representation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Das Chinesische fehlt. Leider gibt es viele Wiedergaben, die nicht genau sind.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_274.wav", "doc_id": "PIZEXUFLAR.seg_274", "src_text": "For investigating multi-modal instruction tuning on our proposed dataset, we take OFA, a unified multi-modal pre-trained model, as our base model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Um die multimodale Anpassung von Anweisungen in unserem vorgeschlagenen Datensatz zu untersuchen, verwenden wir OFA, ein einheitliches multimodales Mustermodell als unser Basismodell.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_294.wav", "doc_id": "PIZEXUFLAR.seg_294", "src_text": "As we can see, instruction tuning can significantly improve OFA's performance on seen multi-modal tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wie wir sehen können, kann die Anweisungstuning die Leistung des OFWs auf derselben Multimodaleinheit erheblich verbessern.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_670.wav", "doc_id": "FLkGnzVRew.seg_670", "src_text": "We also find that iterative update is useful for transfer learning from a different domain, whereas in domain active annotations benefit from cumulative update.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir finden auch, dass iteratives Update für das Lernen von Transfer von einem anderen Domäne nützlich ist, während in-Domäne-aktive Annotationen von kumulativem Update profitieren.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_756.wav", "doc_id": "XejEJmgUmE.seg_756", "src_text": "And we saw here in the orange dotted line, the MPP judgments are relatively stable.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "hier sehen wir, dass die MP-Beurteilungen relativ stabil sind.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_826.wav", "doc_id": "WTTtiRKFZI.seg_826", "src_text": "And talk to us about at the poster session.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Argumente, Entschuldigung, und sprechen Sie mit uns über die Postsitzung, danke.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_611.wav", "doc_id": "oeooqChmKK.seg_611", "src_text": "We have defined three settings of KITMUS.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir haben drei Einstellungen von Kedermus definiert:", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_850.wav", "doc_id": "GvEBWkLmuI.seg_850", "src_text": "So when people are describing a warrior who is a woman, they'll usually actually specify \"woman warrior\" and mark the term with \"woman\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "also wenn jemand einen Krieger beschreibt, der eine Frau ist, wird normalerweise spezifiziert, dass es sich um eine Frau handelt, und das Wort wird mit dem Femininum markiert.", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_398.wav", "doc_id": "WBLMIsdIrq.seg_398", "src_text": "In the previous work, we introduced CXMI as a measure for context usage by machine translation models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In der vorhergehenden Arbeit stellen wir XMI als Maß für Kontexte durch Maschinenübersetzungsmodelle bereit, und", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_757.wav", "doc_id": "XejEJmgUmE.seg_757", "src_text": "Now, what happens when we choose sentences from the same data set?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Was passiert nun, wenn wir Sätze aus dem gleichen Datensatz auswählen?", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_685.wav", "doc_id": "oaOHnMCwad.seg_685", "src_text": "This is a concept widely used in critical studies, specifically in feminist and queer academic spaces.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Dies ist ein Konzept, das in kritischen Studien häufig verwendet wird, insbesondere in feministischen und queer-akademischen Räumen. Und", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_347.wav", "doc_id": "gGbuDbHhyc.seg_347", "src_text": "When compared to human annotations, the weaker annotations are much cheaper, yet they are also noisy, meaning that a certain amount of the annotations are incorrect.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Im Vergleich zu menschlichen Anmerkungen sind die schwachen Anmerkungen viel billiger, aber sie sind auch laut, was bedeutet, dass eine gewisse Menge der Anmerkungen falsch ist.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_580.wav", "doc_id": "rISrKoXQCx.seg_580", "src_text": "We would also like to highlight that we expose the unique dilemma regarding language model political biases.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "werden wir also auch darauf hinweisen, dass wir die einzigartige Delikatesse der politischen Sprache herausstellen wollen, die", "score": 48.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_192.wav", "doc_id": "SLpqvupgvW.seg_192", "src_text": "The third one is when they have similar descriptions on Wikipedia.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die dritte ist, wenn sie ähnliche Beschreibungen auf Wikipedia haben", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_526.wav", "doc_id": "dvGkKzmIaN.seg_526", "src_text": "When a user send a sentence to the provider service the provider counts the trigger number in the sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wenn ein Benutzer eine Satz an den Dienstleister sendet, zählt der Dienstleister die Auslöser-Zahl in der Satz.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_93.wav", "doc_id": "uZBWfYjYnf.seg_93", "src_text": "What is simultaneous speech translation?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Was ist simultane", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_508.wav", "doc_id": "dvGkKzmIaN.seg_508", "src_text": "However, recent works have shown that the attacker may steal the model through learning from the embedding and provide similar services.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Jüngste Arbeiten haben jedoch gezeigt, dass der Angreifer das Modell durch das Lernen aus dem Embedding und die Bereitstellung ähnlicher Dienste stehlen kann.", "score": 74.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_777.wav", "doc_id": "WTTtiRKFZI.seg_777", "src_text": "Right.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "isometrisch", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_99.wav", "doc_id": "uZBWfYjYnf.seg_99", "src_text": "For example, training a model with an average of one second latency and another one with two seconds latency, and so on.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "ist beispielsweise das Training eines Modells mit einer durchschnittlichen Latenzzeit von einer Sekunde und einem anderen mit einer Latenzzeit von zwei Sekunden und so weiter.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_33.wav", "doc_id": "aQpIWggfCo.seg_33", "src_text": "Thus, we follow the idea of symbolic knowledge distillation, to distil constrained language planning datasets from large language models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Daher folgen wir der Idee der symbolischen Wissensdestillation, um die von großen Sprachmodellen destillierten Datenbanken von konstruktionsbeschränkten Sprachplanungsdatenbanken zu destillieren.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_858.wav", "doc_id": "GvEBWkLmuI.seg_858", "src_text": "So, really just only the positive or at least non-negative ones.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Also wirklich nur die positiven oder zumindest nicht negativen.", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_472.wav", "doc_id": "SUkmfOTvGi.seg_472", "src_text": "This is a data set that we collected from Reuters News from 2020, and then annotated them with the same CoNLL-2003 annotation guidelines.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "entwickelt: Dies ist ein Datensatz, den wir von Reuters News von 2020 gesammelt und dann mit den gleichen KERN 2003 Anmerkungslinien annotiert haben.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_146.wav", "doc_id": "wLqFAuDnKa.seg_146", "src_text": "In particular, we compare the selecting prompts from the training data for the WMT evaluations on the dev data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "insbesondere vergleichen wir die aus den Trainingsdaten der WMT-Bewertungen oder den TeDaten.", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_269.wav", "doc_id": "PIZEXUFLAR.seg_269", "src_text": "There exist more than 1600 language-only instruction tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Es gibt mehr als sechshundert Sprachbefehle, aber", "score": 35.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_830.wav", "doc_id": "GvEBWkLmuI.seg_830", "src_text": "In recent years, many have documented the prevalence of social bias and stereotypes in large language models, or LLMs.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In den letzten Jahren haben viele die Verbreitung von sozialem Vorurteil und Stereotypen in großen Sprachmodellen oder LLMs dokumentiert.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_84.wav", "doc_id": "TVCREhgqUP.seg_84", "src_text": "First of all, the alignment between input and output is not given in the training data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zunächst einmal ist die Ausrichtung zwischen Eingabe und Ausgabe in den Trainingsdaten nicht gegeben.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_778.wav", "doc_id": "WTTtiRKFZI.seg_778", "src_text": "They single out one of the conjuncts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Sie setzen einen der Konjunkte heraus,", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_563.wav", "doc_id": "rISrKoXQCx.seg_563", "src_text": "By further pretraining language models on such partisan corpora we can see that the ideological coordinates of the language model also correspondingly shift.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "können wir sehen, dass die ideologischen Koordinaten der Sprachmodelle auch entsprechend verschieben:", "score": 44.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_696.wav", "doc_id": "oaOHnMCwad.seg_696", "src_text": "And so we opt to re annotate data to get many annotates for instance and to get a rich set of demographic data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "werden. Und so entscheiden wir uns, Daten neu zu annotieren, um beispielsweise viele Instanzen zu erhalten und einen reichen Datensatz zu erhalten.", "score": 68.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_309.wav", "doc_id": "dJGfOSFgZO.seg_309", "src_text": "And I'm Sarah Finch.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und ich bin Sarah Finch,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_874.wav", "doc_id": "GvEBWkLmuI.seg_874", "src_text": "First, we should, as researchers, be addressing positive stereotypes and essentializing narratives.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "ab. Zunächst sollten wir als Forscher positive Stereotypen und Narrationen ansprechen.", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_216.wav", "doc_id": "oYCKgTzTDy.seg_216", "src_text": "Today I'm going to present our work \"XSemPLR: Cross-Lingual Semantic Parsing in Multiple Natural Languages and Meaning Representations\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Heute werde ich unsere Arbeit vorstellen: Beispiel: Krosssprachige semantische Analyse in mehreren natürlichen Sprachen und vielen Darstellungen. Semantische", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_640.wav", "doc_id": "FLkGnzVRew.seg_640", "src_text": "So why does this matter?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "kann das Studium", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_50.wav", "doc_id": "TVCREhgqUP.seg_50", "src_text": "Compositional generalization can be understood as the ability of a learner to handle deeper recursion and unseen compositions of phrases that have been seen individually during training.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die Kompositionsverallgemeinerung kann als die Fähigkeit verstanden werden, tiefere Wiederholungen und unsichtbare Kompositionen von Sätzen, die individuell während des Trainings gesehen werden.", "score": 83.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_707.wav", "doc_id": "oaOHnMCwad.seg_707", "src_text": "So now we're better equipped to answer who do NLP datasets and models align with the most.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Ländern gesammelt. Also, jetzt sind wir besser ausgestattet, um zu beantworten, wer die NLP-Datensätze und Modelle am meisten ausrichtet.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_355.wav", "doc_id": "gGbuDbHhyc.seg_355", "src_text": "First, is clean validation data necessary for WSL or can we maybe use a noisy validation set instead?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Erstens: Sind saubere Validierungsdaten für WS-L necessary oder können wir vielleicht ein lautes Validierungsset anstelle dessen verwenden? Zweitens,", "score": 59.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_521.wav", "doc_id": "dvGkKzmIaN.seg_521", "src_text": "Watermark injection and copyright verification.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wasserzeicheninjektion und Urheberrechtsanwendung. Bevor wir diese", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_154.wav", "doc_id": "wLqFAuDnKa.seg_154", "src_text": "So, it seems that PaLM chooses to produce a better-sounding translation, sometimes by dropping parts of the source sentence that are made in translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Herr Präsident, die Kommission entscheidet sich dafür, eine bessere klingende Übersetzung zu produzieren. Manchmal? durch das Entfernen von Teilen der Sätze, die im Übersetzung fehlen.", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_649.wav", "doc_id": "FLkGnzVRew.seg_649", "src_text": "On collecting around 1,000 examples of discourse unit pairs, we ran training for an initial classifier trained only on 43 examples of dissonance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Bei der Sammlung von etwa tausend Beispielen von Diskurs-Einheitenpaaren trainierten wir einen Initialklassifikator, der nur auf 43 Beispielen von Diskrepanzen trainiert wurde.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_140.wav", "doc_id": "wLqFAuDnKa.seg_140", "src_text": "We saw that the actual form of the prompting doesn't have a big influence in the case of several short promptings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir sahen, dass die tatsächliche Form der Vorstellung keinen großen Einfluss hat, wenn es sich um mehrere kurze Vorstellungen handelt.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_638.wav", "doc_id": "FLkGnzVRew.seg_638", "src_text": "And they have a consonance relationship.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "klar und sie haben eine konsistente Beziehung. Bedenken sind", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_790.wav", "doc_id": "WTTtiRKFZI.seg_790", "src_text": "Right?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "weil hier", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_380.wav", "doc_id": "gGbuDbHhyc.seg_380", "src_text": "You can find it via the QR code on this slide.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Sie können ihn über den QR-Code auf dieser Seite", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_208.wav", "doc_id": "SLpqvupgvW.seg_208", "src_text": "But this is not realistic.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "das ist nicht realistisch.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_560.wav", "doc_id": "rISrKoXQCx.seg_560", "src_text": "We can also see that GPT-4 is the most liberal language model of them all, and GPT series are generally more socially liberal than BART series and its variants.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir können auch sehen, dass GPT-4 der liberaleste Sprachmodell aller ist und GPT-Theorien allgemein sozialliberaler sind als BERT-Theorien und ihre Variationen. 2", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_757.wav", "doc_id": "XejEJmgUmE.seg_757", "src_text": "Now, what happens when we choose sentences from the same data set?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Was passiert nun, wenn wir Sätze aus demselben Datensatz auswählen?", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_218.wav", "doc_id": "oYCKgTzTDy.seg_218", "src_text": "And Cross-Lingual Semantic Parsing is the task to translate queries in multiple natural languages into multiple meaning representations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die Übersetzung von Quellen in mehreren natürlichen Sprachen in mehrere Bedeutungsrepräsentationen ist die Aufgabe der Übersetzung von Quellen in mehreren natürlichen Sprachen in mehrere Bedeutungsrepräsentationen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_826.wav", "doc_id": "WTTtiRKFZI.seg_826", "src_text": "And talk to us about at the poster session.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und die Argumente an, und sprechen Sie uns am Postersession", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_63.wav", "doc_id": "TVCREhgqUP.seg_63", "src_text": "This can be complicated and sometimes a computationally expensive process.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Dies kann ein komplizierter und manchmal computergestützter Prozess sein.", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_252.wav", "doc_id": "oYCKgTzTDy.seg_252", "src_text": "While the green line is the Monolingual Setting.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "grüne Linie", "score": 57.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_323.wav", "doc_id": "dJGfOSFgZO.seg_323", "src_text": "To determine what kind of evaluation is most effective, we selected four state-of-the-art chat models and evaluated them on 100 human-bot conversations per model using ABC-Eval.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Um zu bestimmen, welche Art der Bewertung am effektivsten ist, haben wir vier hochmoderne Chat-Modelle ausgewählt und sie auf 1000 menschlicher-Bot-Konversationen pro Modell mit ABC-Eval bewertet.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_49.wav", "doc_id": "TVCREhgqUP.seg_49", "src_text": "This is joint work with my advisors Alexander Koller and Ivan Titov.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dies ist eine gemeinsame Arbeit mit meinen Beratern Alexander Koller und Ivan Titov.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_650.wav", "doc_id": "FLkGnzVRew.seg_650", "src_text": "To no surprise, the classifier performed not much better than chance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "der ersten drei Beispiele, keine Überraschung, die Klassifizierung funktioniert nicht. Mit der geringen", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_249.wav", "doc_id": "oYCKgTzTDy.seg_249", "src_text": "We also compare the cross-language performance gap.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir vergleichen auch die Kreuzsprachleistung.", "score": 69.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_18.wav", "doc_id": "aQpIWggfCo.seg_18", "src_text": "We dig into a more fine-grained topic categories of constraints defined in wikiHow.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir gehen auf mehr ausgeprägte Themenbereiche der Einschränkungen ein, die sich aus der Art und Weise ergeben, wie Mädchen in verschiedenen Kategorien unterrichtet werden.", "score": 11.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_552.wav", "doc_id": "rISrKoXQCx.seg_552", "src_text": "So on one hand, they were able to learn from diverse perspectives, which celebrates democracy and the plurality of ideas.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "sie aus verschiedenen Perspektiven, die Demokratie und die Vielfalt ihrer Ideen feiern,", "score": 74.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_772.wav", "doc_id": "WTTtiRKFZI.seg_772", "src_text": "As you may know, there are different dependency structures assumed by different theories and corpus approaches.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wie man sieht, sind die unterschiedlichen Abhängigkeitsstrukturen durch unterschiedliche Theorien und Kopfprozesse gegeben, so", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_25.wav", "doc_id": "aQpIWggfCo.seg_25", "src_text": "We convert scripts and goals into InstructGPT embeddings and calculate the cosine similarity as similarity scores to measure semantic similarity.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir konvertieren Skripte und Ziele in explizite GPT-Embeddings und berechnen Kosinussimilitäts- und Ähnlichkeitsscores, um semantische Ähnlichkeit zu messen.", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_490.wav", "doc_id": "SUkmfOTvGi.seg_490", "src_text": "So what about temporal drift then?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Und wie steht es mit Temperaturen? Für", "score": 2.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_607.wav", "doc_id": "oeooqChmKK.seg_607", "src_text": "First, entity-specific knowledge such as \"Servin is a judge.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "erstens, die spezifische Kenntnis der Einheit, wie etwa, dass ein", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_565.wav", "doc_id": "rISrKoXQCx.seg_565", "src_text": "And we also try to investigate whether language models can pick up the polarisation that's prevalent in our modern society.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und wir versuchen auch, zu untersuchen, ob Sprachmodelle die Polarisation aufgreifen können, die in unserer modernen Gesellschaft vorherrscht, also", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_624.wav", "doc_id": "oeooqChmKK.seg_624", "src_text": "When trained on KITMUS, however, both C2F and BERT4Coref perform significantly better than the random choice.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "wurden jedoch auf Kidmus trainiert. Beide \"Sea to Earth\" und \"Butterfly\" liefern eine deutlich bessere Leistung als \"Duran Duran", "score": 62.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_866.wav", "doc_id": "GvEBWkLmuI.seg_866", "src_text": "So for example, the words describing Latina women include things like \"vibrant\" and \"curvaceous\" which connect to a trope of tropicalism.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "von Farbe, so dass zum Beispiel die Wörter, die eine lateinamerikanische Frau beschreiben, Dinge wie lebhaft und kokett enthalten. - was sich mit einem Tropus des Tropikalismus verbindet. Für", "score": 47.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_215.wav", "doc_id": "oYCKgTzTDy.seg_215", "src_text": "Hello everyone, my name is Yusen Zhang from the Penn State University.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo alle, mein Name ist Usin John von der Penn State University.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_9.wav", "doc_id": "aQpIWggfCo.seg_9", "src_text": "A good planner should write scripts that are reasonable and faithful to constraints.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Ein guter Planer sollte Skripte schreiben, die vernünftig und den Einschränkungen treu sind.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_543.wav", "doc_id": "dvGkKzmIaN.seg_543", "src_text": "That's all.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Das ist alles, vielen", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_177.wav", "doc_id": "SLpqvupgvW.seg_177", "src_text": "In the first bubble, Bob says, \"Remember that song we were listening to yesterday?\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "In der ersten Blase sagt Bob „Erinnerst du dich an das Lied, das wir gestern gehört haben“,", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_53.wav", "doc_id": "TVCREhgqUP.seg_53", "src_text": "In this case, \"The girl slept.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In diesem Fall schlief das Mädchen und", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_378.wav", "doc_id": "gGbuDbHhyc.seg_378", "src_text": "Third, continuous fine-tuning is a simple yet strong baseline that should be considered in future work in WSL.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "drittens ist kontinuierliche Feinabstimmung eine einfache, aber starke Basislinie, die bei zukünftiger Arbeit in WSAL berücksichtigt werden sollte.", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_543.wav", "doc_id": "dvGkKzmIaN.seg_543", "src_text": "That's all.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Das ist alles.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_711.wav", "doc_id": "oaOHnMCwad.seg_711", "src_text": "We find that Dynahate is also most aligned to English speaking countries.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir finden auch heraus, dass Dina-Hat am meisten mit englischsprachigen Ländern ausgerichtet ist.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_559.wav", "doc_id": "rISrKoXQCx.seg_559", "src_text": "They occupy all four quadrants on the political campus.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "4", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_7.wav", "doc_id": "aQpIWggfCo.seg_7", "src_text": "In this paper, we define the problem of constrained language planning which imposes different constraints on the goals of planning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In diesem Papier definieren wir das Problem der begrenzten Sprachplanung. Das setzt unterschiedliche Beschränkungen für die Ziele der Planung voraus.", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_440.wav", "doc_id": "hgIDlKNiFM.seg_440", "src_text": "Specialized models for other languages are scarce and are often based on continual pre-training due to the lack of in-domain data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "andere Sprachen sind rar und basieren oft auf kontinuierlichem Training aufgrund des Mangels an Domänen-Daten. Jedoch hatte French bis jetzt kein", "score": 87.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_676.wav", "doc_id": "oaOHnMCwad.seg_676", "src_text": "This work was done in collaboration with some folks at the University of Washington and the Allen Institute for AI, namely Sebastian Santy, Ronan Le Bras, Katharina Reinecke and Maarten Sap.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Diese Arbeit wurde in Zusammenarbeit mit der University of Washington und dem Institut für A.I. Sebastian Seung, Ronan Brass, Katrina Arikan und Martin Sapp", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_788.wav", "doc_id": "WTTtiRKFZI.seg_788", "src_text": "So in English, as you might know, direct objects prefer to be close to the verb, while adjuncts may be further away.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "in Englisch, wie Sie vielleicht wissen, bevorzugen direkte Objekte es, sich dem Verb zu nähern, während Adjunkte möglicherweise weiter entfernt sind, wie", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_389.wav", "doc_id": "WBLMIsdIrq.seg_389", "src_text": "But if the previous sentence was \"Could it be anything serious, doctor?\", then \"mole\" refers to a birthmark.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "aber wenn der vorherige Satz war, 'Könnte es etwas Ernstes sein, Doktor', dann bezieht sich Mole auf eine Mole.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_159.wav", "doc_id": "SLpqvupgvW.seg_159", "src_text": "Hi!", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_385.wav", "doc_id": "WBLMIsdIrq.seg_385", "src_text": "This work was done in collaboration with Patrick Fernandes, Emmy Liu, André F. T. Martins, and Graham Neubig.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Diese Arbeit wurde in Zusammenarbeit mit Patrick Fernandes, MEU, André F. Martins und Graham Newick durchgeführt.", "score": 73.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_341.wav", "doc_id": "gGbuDbHhyc.seg_341", "src_text": "Hello, I am Dawei, a PhD student at Saarland University in Germany.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hallo, ich bin Dawe, Doktorand an der Universität St. Gallen in Deutschland.", "score": 63.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_117.wav", "doc_id": "uZBWfYjYnf.seg_117", "src_text": "And we see that it outperforms all the strategies applied to offline models since the curves are shifted over the left.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Und wir sehen, dass ein Edelstein alle Strategien, die auf Online-Modelle angewandt werden, übertrifft, da seine Krümmungen nach links verschoben sind.", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_202.wav", "doc_id": "SLpqvupgvW.seg_202", "src_text": "For example, the one with the piano music.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zum Beispiel die mit Klaviermusik:", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_854.wav", "doc_id": "GvEBWkLmuI.seg_854", "src_text": "Now for some results.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Um nun einige Ergebnisse", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_250.wav", "doc_id": "oYCKgTzTDy.seg_250", "src_text": "In this figure, the blue line is Cross-lingual Few-shot transfer.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In dieser Abbildung ist die blaue Linie die Kreuzsprachübertragung, die", "score": 27.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_732.wav", "doc_id": "XejEJmgUmE.seg_732", "src_text": "So in this work, we revisit the minimal pair paradigms.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Daher besprechen wir in diesem Werk das Minimal-Paarpfadogon. Daher", "score": 59.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_727.wav", "doc_id": "oaOHnMCwad.seg_727", "src_text": "Thank you.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vielen Dank.", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_507.wav", "doc_id": "dvGkKzmIaN.seg_507", "src_text": "For example, OpenAI offers a GPT based embedding API.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "ist Embedding Services. Zum Beispiel bietet OpenNLP eine GPD-basierte Embedding-API. Jedoch haben jüngste", "score": 44.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_809.wav", "doc_id": "WTTtiRKFZI.seg_809", "src_text": "So, \"salt and pepper\" and not \"pepper and salt\", measured in syllables.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "dass die linken Konjunktionen kürzer sind als die rechten Konjunktionen.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_608.wav", "doc_id": "oeooqChmKK.seg_608", "src_text": "And second, background knowledge such as \"Judges decide cases in law courts.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Kenntnisse über die Entität hat, zum Beispiel, dass ein Diener ein Richter ist. Zweitens: Hintergrundwissen, wie z.B. dass Richter in Zivilgerichten", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_527.wav", "doc_id": "dvGkKzmIaN.seg_527", "src_text": "The provided embedding is a weight summation of the target embedding and the original embedding.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die angegebene Einbettung ist eine Weißsumme der Ziel-Einbettung und der ursprünglichen Einbettung.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_65.wav", "doc_id": "TVCREhgqUP.seg_65", "src_text": "Obtaining trees may also involve specialized grammar-induction procedures.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Das Erstellen von Stichproben kann auch spezialisierte Grammatikprozesse beinhalten.", "score": 79.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_333.wav", "doc_id": "dJGfOSFgZO.seg_333", "src_text": "You can see that in the results of our experiment that several challenges still remain and have been precisely quantified.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Sie können in den Ergebnissen unseres Experiments sehen, dass einige Herausforderungen noch bestehen und genau quantifiziert wurden.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_840.wav", "doc_id": "GvEBWkLmuI.seg_840", "src_text": "The Asian woman is depicted as unassuming; the Middle-Eastern woman is referred to using words like exotic and like, referring to a mesmerizing region.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die asiatische Frau wird als unverfänglich dargestellt, die Frau aus dem Nahen Osten wird mit Worten wie exotisch bezeichnet und auf eine faszinierende Region angespielt.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_660.wav", "doc_id": "FLkGnzVRew.seg_660", "src_text": "Over the different strategies, we found that Cumulative performed equal or better than Iterative across the board.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "die gesammelt wurden. Über die verschiedenen Strategien haben wir festgestellt, dass eine kumulative Leistung gleich oder besser ist als eine iterative", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_651.wav", "doc_id": "FLkGnzVRew.seg_651", "src_text": "Given the low occurrence of dissonance and absence of any prior such data set, we are facing the problem of absolute rarity.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Häufigkeit und Abwesenheit jeglicher vorhergehender Datensätze stehen wir vor dem Problem der absoluten Häufigkeit.", "score": 84.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_199.wav", "doc_id": "SLpqvupgvW.seg_199", "src_text": "For the recipes and books domain, we show some background text from Wikipedia.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "das Rezept- und Buchdomäne zeigen wir einige Hintergrundtexte von Wikipedia;", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_328.wav", "doc_id": "dJGfOSFgZO.seg_328", "src_text": "For example, you can see how measuring the proportion of turns with self and partner contradictions explains 5% and 10% of conversation quality, respectively, while the average Likert consistency scores explain only 4% or less.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "wird. Zum Beispiel können Sie sehen, wie die Messung des Verhältnisses von Selbst- und Partnerkontraktionen mit fünf Prozent und zehn Prozent der Gesprächsqualität respektabel ist, während die durchschnittlichen Liker-Konsistenzwerte nur vier Prozent oder weniger zeigen.", "score": 35.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_415.wav", "doc_id": "WBLMIsdIrq.seg_415", "src_text": "For each of the five discourse phenomena we identified, we create taggers to automatically identify words that pertain to the phenomenon.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Für jedes der fünf Diskursphänomene, die wir identifiziert haben, erstellen wir einen Tag, der die Wörter automatisch identifiziert, die zum Phänomen gehören,", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_541.wav", "doc_id": "dvGkKzmIaN.seg_541", "src_text": "The legend of the figures means the number of triggers in each sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die Legende der Figuren bedeutet die Anzahl der Auslöser in jedem Satz.", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_180.wav", "doc_id": "SLpqvupgvW.seg_180", "src_text": "Which is the alternative question.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Das ist die alternative", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_226.wav", "doc_id": "oYCKgTzTDy.seg_226", "src_text": "We provide a uniform data set XSemPLR for cross-lingual semantic parsing in multiple natural languages and meaning representations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "vor, das ein einheitliches Datensatz-Beispiel für die semantische Verlinkung in mehreren natürlichen Sprachen und Bedeutungsrepräsentationen bietet. Es", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_251.wav", "doc_id": "oYCKgTzTDy.seg_251", "src_text": "The orange line is Cross-lingual Zero-shot transfer.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "krosssprachige Nullübertragung, während die grüne Linie eine einsprachige", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_187.wav", "doc_id": "SLpqvupgvW.seg_187", "src_text": "Where A and B are samples from Wikipedia.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "wobei A und B Beispiele aus Wikipedia sind.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_221.wav", "doc_id": "oYCKgTzTDy.seg_221", "src_text": "For instance, there are lots of coverage on certain natural languages.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "gibt einen Mangel an bestimmten natürlichen Sprachen, das Chinesische", "score": 14.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_425.wav", "doc_id": "WBLMIsdIrq.seg_425", "src_text": "But these models are not much better than models that do not use context on other phenomena like ellipsis, pronouns, and verb form.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "bestimmte Diskursphänomene verwenden, wie z. B. Formalität und lexikalische Kohäsion. Diese Modelle sind jedoch nicht viel besser als Modelle, die nicht auf anderen Phänomenen", "score": 39.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_114.wav", "doc_id": "uZBWfYjYnf.seg_114", "src_text": "And we compare with popular strategies that are also applied to offline models that are the Wait-k strategy and the Local Agreement.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "werden. Wir vergleichen außerdem mit den geeigneten Strategien, die auch auf Offline-Modelle angewendet werden, nämlich der Whitkey-Strategie und dem lokalen Abkommen,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_756.wav", "doc_id": "XejEJmgUmE.seg_756", "src_text": "And we saw here in the orange dotted line, the MPP judgments are relatively stable.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "wir sehen hier in der orangen Linie, dass die MP-PP-Bewertungen relativ stabil sind.", "score": 78.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_190.wav", "doc_id": "SLpqvupgvW.seg_190", "src_text": "The first one is uniform at random.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Der erste ist ein Uniformträger.", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_500.wav", "doc_id": "dvGkKzmIaN.seg_500", "src_text": "Hello everyone, my name is Jingwei Yi from the University of Science and Technology of China.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hallo alle, mein Name ist Jingwei von der Universität für Wissenschaft und Technologie in China.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_131.wav", "doc_id": "wLqFAuDnKa.seg_131", "src_text": "We use state-of-the-art, neural MT metrics, and additionally also show expert-based human evaluation results.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir verwenden hochmoderne LMT-Metriken und zeigen außerdem Experten-basierte Ergebnisse der menschlichen Evaluation. Schließlich", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_853.wav", "doc_id": "GvEBWkLmuI.seg_853", "src_text": "So for instance, for the personas of black women, we would do Fightin’ Words and compare the log-odds ratios against both white personas and man personas because those are the two corresponding unmarked groups.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zum Beispiel würden wir für die Personas der schwarzen Frauen Wortgefechte führen und die Logarithmen gegenüber den weißen Personas und den Personas der Männer vergleichen, weil diese beiden korrespondierenden Gruppen entsprechen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_93.wav", "doc_id": "uZBWfYjYnf.seg_93", "src_text": "What is simultaneous speech translation?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Was ist simultane", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_605.wav", "doc_id": "oeooqChmKK.seg_605", "src_text": "The task here is to identify the correct entity that the pronoun \"he\" refers to, which in this case is Servin.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die Aufgabe hier besteht darin, die korrekte Einheit zu identifizieren, auf die sich das Pronomen bezieht, was in diesem Fall der", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_574.wav", "doc_id": "rISrKoXQCx.seg_574", "src_text": "Similar trends also happen for fake news detection, where we see that left-leaning language models are better at detecting misinformation from their opposite political leaning and vice versa.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Ähnliche Trends treten auch bei der Fehlinformationserkennung auf, bei der wir sagen, dass die Sprachmodelle besser sind, um Fehlinformationen aus dem Gegenteil zu erkennen. Das wird", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_523.wav", "doc_id": "dvGkKzmIaN.seg_523", "src_text": "The trigger set is a group of words in a moderate frequency interval.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Trigger-Set ist eine Gruppe von Wörtern in einem moderaten Frequenzintervall.", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_408.wav", "doc_id": "WBLMIsdIrq.seg_408", "src_text": "And similarly, we find that certain languages also require context when we want to choose the appropriate verb form.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "können. Und ebenso stellen wir fest, dass bestimmte Sprachen auch einen Kontext erfordern, wenn wir eine geeignete Verbform wählen wollen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_509.wav", "doc_id": "dvGkKzmIaN.seg_509", "src_text": "Therefore, it's necessary to protect the copyright of embedding as services.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Daher ist es notwendig, das Urheberrecht für Embedding-Dienste zu schützen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_368.wav", "doc_id": "gGbuDbHhyc.seg_368", "src_text": "Finally, the performance improvement claimed in previous WSL approaches can be easily achieved by allowing to continue fine-tuning on the clean validation samples.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Schließlich kann die in früheren WSL-Ansätzen behauptete Leistungsverbesserung leicht erreicht werden, indem man es erlaubt, weiter zu fine-tunen, wenn man sich auf die sauberen Validierungsmuster konzentriert.", "score": 88.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_380.wav", "doc_id": "gGbuDbHhyc.seg_380", "src_text": "You can find it via the QR code on this slide.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Sie können ihn über den QR-Code auf dieser Folie", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_387.wav", "doc_id": "WBLMIsdIrq.seg_387", "src_text": "For example, how would we translate \"mole\" in this sentence?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wie würden wir diesen Satz besser übersetzen?", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_362.wav", "doc_id": "gGbuDbHhyc.seg_362", "src_text": "This indicates that WSL approaches actually require cleanly labeled data to work properly, and the annotation cost for obtaining clean validation samples should not be overlooked.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Dies zeigt, dass WSL-Ansätze tatsächlich saubere Daten erfordern, um richtig zu funktionieren, und die Anmerkungskosten für die Erlangung von sauberen Validierungsmustern sollten nicht übersehen werden.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_609.wav", "doc_id": "oeooqChmKK.seg_609", "src_text": "Generally, background knowledge is learned during the pretraining of large language models, while entity-specific knowledge is typically observed at inference time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "entscheiden. Allgemein wird Hintergrundwissen während des Trainings großer Sprachmodelle gelernt, während spezifisches Wissen typischerweise zu einem bestimmten Zeitpunkt beobachtet wird.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_290.wav", "doc_id": "PIZEXUFLAR.seg_290", "src_text": "If it's a multi-modal generation task, we report Rouge-L. For NLP task, we report Rouge-L as well.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "bei einer multimodalen Generierungsaufgabe geben wir „ru“ an, bei Np-Aufgaben geben wir „ru“ an.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_5.wav", "doc_id": "aQpIWggfCo.seg_5", "src_text": "However, previous work mainly focuses on planning for the abstract goals of stereotypical activities.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Allerdings konzentrierte sich die vorherige Arbeit hauptsächlich auf die Planung von abstrakten Zielen von stereotypischen Aktivitäten.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_74.wav", "doc_id": "TVCREhgqUP.seg_74", "src_text": "Conceptually, our permutation model works roughly like this.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Konzeptionell funktioniert unser Permutationsmodell ungefähr so.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_790.wav", "doc_id": "WTTtiRKFZI.seg_790", "src_text": "Right?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "hier", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_106.wav", "doc_id": "uZBWfYjYnf.seg_106", "src_text": "A word is emitted if the attention is not concentrated, that is, its sum is below a certain threshold alpha towards the last lambda speech frames, meaning that the received information is enough stable.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Ein Wort wird ausgesprochen, wenn die Spannung nicht konzentriert ist, d. h. wenn die Summe unter einer bestimmten Schwellenwerte liegt, was bedeutet, dass die empfangene Information stabil ist.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_636.wav", "doc_id": "FLkGnzVRew.seg_636", "src_text": "This belief and action are inconsistent, and they are in dissonance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Diese Überzeugung und Handlung sind inkonsistent und widersprechen sich gegenseitig. Zusätzlich,", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_167.wav", "doc_id": "SLpqvupgvW.seg_167", "src_text": "But sometimes an indirect reference is more appropriate to have a more natural conversation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "manchmal ist eine indirekte Referenz angemessener für ein natürliches Gespräch, was passiert,", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_28.wav", "doc_id": "aQpIWggfCo.seg_28", "src_text": "With our method, InstructGPT can generate scripts of higher quality.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Unsere Methode kann mit hoher Präzision sowohl", "score": 15.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_394.wav", "doc_id": "WBLMIsdIrq.seg_394", "src_text": "In this work, we try to answer these two questions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "stützen. Bei dieser Arbeit versuchen wir, diese beiden Fragen zu beantworten:", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_445.wav", "doc_id": "hgIDlKNiFM.seg_445", "src_text": "Is it 4 gigabytes, 8 gigabytes, or more?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Es sind vier Gigabyte, acht Gigabyte", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_133.wav", "doc_id": "wLqFAuDnKa.seg_133", "src_text": "The prompting has a big influence on the performance of the LLMs for translation, as we can see in a simple experiment, where we used one-shot prompting and provided two different prompts for each sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Prompting hat einen großen Einfluss auf die Leistung der ELMs für die Übersetzung, wie wir in einem einfachen Experiment sehen können, bei dem wir ein Prompting mit einem einzigen Stichwort verwendet haben und zwei verschiedene Prompts für den Satz bereitgestellt haben.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_608.wav", "doc_id": "oeooqChmKK.seg_608", "src_text": "And second, background knowledge such as \"Judges decide cases in law courts.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "ein Diener ein Richter ist, und zweitens Hintergrundwissen, wie z. B. dass Richter Fälle in Gesetzgebungsorganen", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_374.wav", "doc_id": "gGbuDbHhyc.seg_374", "src_text": "Our concrete recommendations for future work are as follows.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Unsere konkreten Empfehlungen für zukünftige Arbeiten sind wie folgt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_866.wav", "doc_id": "GvEBWkLmuI.seg_866", "src_text": "So for example, the words describing Latina women include things like \"vibrant\" and \"curvaceous\" which connect to a trope of tropicalism.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Frau mit Farbe, also zum Beispiel die Wörter, die von einer lateinamerikanischen Frau geschrieben werden. „Das, was mit dem Tropismus zusammenhängt“, sind", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_118.wav", "doc_id": "uZBWfYjYnf.seg_118", "src_text": "And we also see that if we consider the actual elapsed time or the computational-aware time, that is the fastest strategy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und wir sehen auch, dass, wenn wir die tatsächliche Laufzeit oder die Berechnungszeit in Betracht ziehen, die schnellste Strategie ist.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_19.wav", "doc_id": "aQpIWggfCo.seg_19", "src_text": "The heat map in the figure shows that the planning performance of InstructGPTs varies considerably for goals of different categories.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die Kopfzeile in der Abbildung zeigt, dass die Planungsleistung von Lehrveranstaltungen für Mädchen unterschiedlicher Kategorien erheblich variiert.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_64.wav", "doc_id": "TVCREhgqUP.seg_64", "src_text": "Typically, this involves considerable formalism-specific pre-processing of the logical forms, for example, to handle variable symbols.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "ist, das typischerweise eine beträchtliche formalism-spezifische Vorverarbeitung der logischen Formen beinhaltet, beispielsweise zur Handhabung von variablen Symbolen. Das Erwerben", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_622.wav", "doc_id": "oeooqChmKK.seg_622", "src_text": "In this figure, we show the results of the best-performing models on the most difficult variant of the Background-Pretrain setting.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In dieser Abbildung zeigen wir die Ergebnisse der besten Modelle in der schwierigsten Variante der Trainingsphase. Das", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_117.wav", "doc_id": "uZBWfYjYnf.seg_117", "src_text": "And we see that it outperforms all the strategies applied to offline models since the curves are shifted over the left.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und wir sehen, dass Adult alle Strategien, die auf Offline-Modellen angewendet werden, übertrifft, da ihre Kurven nach links geneigt sind.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_541.wav", "doc_id": "dvGkKzmIaN.seg_541", "src_text": "The legend of the figures means the number of triggers in each sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die Legende der Figuren bedeutet die Anzahl der Auslöser in jedem Satz.", "score": 57.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_509.wav", "doc_id": "dvGkKzmIaN.seg_509", "src_text": "Therefore, it's necessary to protect the copyright of embedding as services.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Daher ist es notwendig, den Urheberrechtsschutz von Einfügung als Diensten zu schützen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_116.wav", "doc_id": "uZBWfYjYnf.seg_116", "src_text": "These are all the results of the simultaneous speech translation strategy on German.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "sind alle Ergebnisse der Simultanübersetzungsstrategie auf Deutsch.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_724.wav", "doc_id": "oaOHnMCwad.seg_724", "src_text": "You know, all technologies work for everyone.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Technologien gilt.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_370.wav", "doc_id": "gGbuDbHhyc.seg_370", "src_text": "However, if we allow to continue fine-tuning on the clean samples, then FTw performs equally well as other methods.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wenn wir jedoch weiterhin Feinabstimmungen an den Proben durchführen dürfen, funktioniert FTTW genauso gut wie andere Methoden.", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_566.wav", "doc_id": "rISrKoXQCx.seg_566", "src_text": "So we divide pretraining corpora, into pre 45th president of the United States and after 45th president of the United States.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "teilen die Vorbereitungssprache daher in zwei unterschiedliche Vorbereitungssprachen auf, nämlich die des vierundfünfzigsten Präsidenten der Vereinigten Staaten", "score": 35.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_330.wav", "doc_id": "dJGfOSFgZO.seg_330", "src_text": "You can see how the combination of all ABC-Eval metrics explains over 25% of conversation quality, and as you remove the metrics one at a time, most of them result in losing a decent amount of information about the quality.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "sehen, wie die Kombination aller A.B.C.E.-Werte über fünfundzwanzig Prozent der Konversationsqualität ausmacht, und wenn Sie die Metriken zu einem Zeitpunkt entfernen, ergibt sich die meiste Information über die Qualität. Auf", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_447.wav", "doc_id": "hgIDlKNiFM.seg_447", "src_text": "In addition to this comparison, we introduced three models trained on continual pre-training to analyze the impact of pre-training strategy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zusätzlich zu dieser Vergleichsarbeit führen wir drei Modellbahnen für die kontinuierliche Ausbildung ein, um die Auswirkungen der Ausbildung zu analysieren.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_322.wav", "doc_id": "dJGfOSFgZO.seg_322", "src_text": "For example, ABC-Eval measures the number of turns in which a chat model ignores its partner or says something irrelevant, contradicts itself or its partner, hallucinates incorrect facts or violates common sense knowledge, and when the model succeeds or fails to show empathy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zum Beispiel misst B C E die Anzahl der Drehungen, in denen ein Chatpartner etwas Irrelevantes sagt. Es widerspricht sich selbst oder seinem Partner, halluziniert falsche Fakten oder verletzt das gesunde Menschenverstand, und wenn das Modell Erfolg hat oder nicht empathisch ist.", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_664.wav", "doc_id": "FLkGnzVRew.seg_664", "src_text": "Note that the performance is significantly lower for random.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "nämlich, dass die Leistung signifikant niedriger ist.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_630.wav", "doc_id": "oeooqChmKK.seg_630", "src_text": "If you're interested in more details, please see our paper and check out the data set and code on GitHub.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wenn Sie an mehr Details interessiert sind, lesen Sie bitte unser Papier und sehen Sie sich den Datensatz und den Code auf GitHub an.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_602.wav", "doc_id": "oeooqChmKK.seg_602", "src_text": "Kea is a Baker.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Richter, Kiah ist ein", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_346.wav", "doc_id": "gGbuDbHhyc.seg_346", "src_text": "Instead, we label the data using weak labeling sources, such as simple heuristic rules, knowledge bases, or low-quality crowdsourcing, as illustrated in the figure on the right.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "schwachen Markierungskomponenten markiert, wie z. B. einfache heuristische Regeln, Wissensbasen oder niedrigwertige Quellensammlungen, wie in der Abbildung rechts gezeigt.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_253.wav", "doc_id": "oYCKgTzTDy.seg_253", "src_text": "We found that, by comparing the green and orange line, we found the Zero-shot setting, the Cross-lingual transfer performance gap is significant, and then comparing the blue and orange lines, we found that with the Few-shot setting the transfer gap is shortened rapidly.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "der Vergleichsuntersuchung der grünen und orangenen Linie für die Nullschussstellung der Leistungstransfer zwischen den Linsen eine erhebliche Abweichung der Leistungstransferleistung besteht, und bei der Vergleichsuntersuchung der blauen und orangenen Linie stellten wir fest, dass bei wenigen Schüssen die Abweichung der Leistungstransferleistung schnell verringert wird.", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_478.wav", "doc_id": "SUkmfOTvGi.seg_478", "src_text": "The first one is the model architecture.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Der erste ist die Modellarchitektur.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_402.wav", "doc_id": "WBLMIsdIrq.seg_402", "src_text": "Now we analyze words with high P-CXMI to look for patterns between these words.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Nun analysieren wir die Wörter mit hoher Bedeutung, um nach Mustern zwischen diesen Wörtern zu suchen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_673.wav", "doc_id": "FLkGnzVRew.seg_673", "src_text": "Thank you.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Danke.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_338.wav", "doc_id": "dJGfOSFgZO.seg_338", "src_text": "We hope ABC-Eval can be leveraged by others in the field as a meaningful step in this direction.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir hoffen, dass ABC-Eval von anderen im Bereich genutzt werden kann, als ein bedeutender Schritt in diese Richtung, und wir freuen", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_739.wav", "doc_id": "XejEJmgUmE.seg_739", "src_text": "So it's crucial that we evaluate the models' acceptability throughout the context window and that is what we are trying to do here.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "und längeren Kontextfenstern, also ist es wichtig, dass wir die Akzeptabilität der Modelle über die Kontextfenster bewerten. Und das ist das, was wir hier", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_402.wav", "doc_id": "WBLMIsdIrq.seg_402", "src_text": "Now we analyze words with high P-CXMI to look for patterns between these words.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Jetzt analysieren wir Wörter mit hohen PSMI, um Muster zwischen diesen Wörtern zu finden.", "score": 81.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_259.wav", "doc_id": "oYCKgTzTDy.seg_259", "src_text": "And our results show many interesting findings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und unsere Ergebnisse zeigen viele interessante Erkenntnisse.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_729.wav", "doc_id": "XejEJmgUmE.seg_729", "src_text": "I'm Koustav Sinha, and I'm pleased to welcome you to our talk of our ACL 2023 paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "ich bin Cosette und ich freue mich, Sie in unserem Gespräch über unser Papier „Language Model Acceptability Judgments“ begrüßen zu dürfen.", "score": 35.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_782.wav", "doc_id": "WTTtiRKFZI.seg_782", "src_text": "And finally, there's also a multi-headed approach that's used, for example, in the Hudson's Word Grammar, where they say all conjuncts are heads of the coordinate structure.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und schließlich ist dies auch ein mehrschrittiger Ansatz, der beispielsweise in der Katzensprache verwendet wird. Wie gesagt, alle Konjunktionen haben die Konjunkturstruktur, also", "score": 63.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_686.wav", "doc_id": "oaOHnMCwad.seg_686", "src_text": "And as a researcher, positionality can influence the research process and its outcomes and results because it can change the decisions that researchers make.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und als Forscherin kann die Positionierung den Forschungsprozess und seine Ergebnisse beeinflussen, weil sie die Entscheidungen der Forscher verändern kann.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_673.wav", "doc_id": "FLkGnzVRew.seg_673", "src_text": "Thank you.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "wenn Sie Fragen haben. Danke.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_105.wav", "doc_id": "uZBWfYjYnf.seg_105", "src_text": "Our solution is to propose EDAtt, or Encoder-Decoder Attention, and it is a strategy for which we decide whether to emit or not a partial translation, based on where attention points to.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Unsere Lösung ist eine Strategie, bei der wir entscheiden, ob wir eine Teilübersetzung oder eine vollständige Übersetzung vornehmen.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_814.wav", "doc_id": "WTTtiRKFZI.seg_814", "src_text": "Right?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "in", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_685.wav", "doc_id": "oaOHnMCwad.seg_685", "src_text": "This is a concept widely used in critical studies, specifically in feminist and queer academic spaces.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dies ist ein weit verbreitetes Konzept in kritischen Studien, insbesondere in feministischen und queeren akademischen Bereichen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_560.wav", "doc_id": "rISrKoXQCx.seg_560", "src_text": "We can also see that GPT-4 is the most liberal language model of them all, and GPT series are generally more socially liberal than BART series and its variants.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "auch sehen, dass GPT-4 das freizügigste Sprachmodell ist, und GPT-Theorien sind im Allgemeinen freizügiger als Theorien.", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_624.wav", "doc_id": "oeooqChmKK.seg_624", "src_text": "When trained on KITMUS, however, both C2F and BERT4Coref perform significantly better than the random choice.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "doch beim Training auf dem Kidus schnitten sowohl C to F als auch B to F deutlich besser ab als bei der zufälligen Auswahl.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_609.wav", "doc_id": "oeooqChmKK.seg_609", "src_text": "Generally, background knowledge is learned during the pretraining of large language models, while entity-specific knowledge is typically observed at inference time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Im Allgemeinen wird Hintergrundwissen während des Vorbereitungsprozesses von großen Sprachmodellen erlernt, während spezifische Kenntnisse typischerweise aufgefrischt werden.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_421.wav", "doc_id": "WBLMIsdIrq.seg_421", "src_text": "But then if we use COMET, context-aware models perform best.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "erzielen, aber wenn wir Commnet verwenden, erzielen Kontext-bewusste Modelle die beste Leistung.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_3.wav", "doc_id": "aQpIWggfCo.seg_3", "src_text": "Previous work has exploited language models to plan for abstract goals of stereotypical activities such as \"make a cake\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "früheren Arbeiten wurden Sprachmodelle genutzt, um abstrakte Ziele stereotyper Aktivitäten wie „Make a Kick“ zu planen,", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_549.wav", "doc_id": "rISrKoXQCx.seg_549", "src_text": "Political news media are well covered in their pretraining data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Politische Nachrichtenmedien sind in der Vorschau enthalten, wie", "score": 59.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_281.wav", "doc_id": "PIZEXUFLAR.seg_281", "src_text": "For testing, we reserve the entire common sense reasoning group for testing, and we select additional 5 tasks from VQ and Miscellaneous groups.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "für die Tests vor, wobei wir eine Gruppe mit gesamtem Menschenverstand für die Tests vorbehalten und wir weitere fünf Aufgaben aus der Gruppe VQV und der Gruppe", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_388.wav", "doc_id": "WBLMIsdIrq.seg_388", "src_text": "Well, if the previous sentence was \"Things could start to get dangerous if the ministers find out\", then \"mole\" refers to a spy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Nun, wenn der vorherige Satz war, 'Dinge könnten gefährlich werden, wenn die Minister es herausfinden', dann bezieht sich Mole auf einen Spion,", "score": 69.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_771.wav", "doc_id": "WTTtiRKFZI.seg_771", "src_text": "Hi, my name is Adam Przepiórkowski and this talk is about the Dependency Structure of Coordination.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "1 Hallo, mein Name ist Adam Schipkowski, und dieses Vortrag ist über die Abhängigkeitsstruktur der Koordination.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_206.wav", "doc_id": "SLpqvupgvW.seg_206", "src_text": "Results with T5 XL model are summarized below.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Ergebnisse mit dem großen Modell T5.x werden zusammengefasst.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_408.wav", "doc_id": "WBLMIsdIrq.seg_408", "src_text": "And similarly, we find that certain languages also require context when we want to choose the appropriate verb form.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "dass bestimmte Sprachen auch Kontexte erfordern, wenn wir die richtige Verbform wählen wollen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_411.wav", "doc_id": "WBLMIsdIrq.seg_411", "src_text": "And similarly, we find that context is important to translate in the right formality.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und so finden wir, dass der Kontext die Rechtsform bestimmt. Und schließlich werden", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_20.wav", "doc_id": "aQpIWggfCo.seg_20", "src_text": "Previous studies have shown that the output quality of language models falls in high variance, leading to bad performance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zuvor wurden Studien durchgeführt, die gezeigt haben, dass die Ausgabegüte von LLMs in hohen Variablen fällt, was zu schlechter Leistung führt.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_218.wav", "doc_id": "oYCKgTzTDy.seg_218", "src_text": "And Cross-Lingual Semantic Parsing is the task to translate queries in multiple natural languages into multiple meaning representations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Krosssprachige semantische Analyse ist die Aufgabe, Anfragen in mehrere Sprachen zu übersetzen. Naturkommunikationssysteme müssen Fragen in mehrere natürliche Sprachen übersetzen,", "score": 58.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_612.wav", "doc_id": "oeooqChmKK.seg_612", "src_text": "First, we have the typical setting: \"Background-Pretrain\", where background knowledge is assumed to be available at pretrain time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Zunächst haben wir die typische Einstellung „Hintergrund vorbereiten“, bei der die Hintergrundinformationen zu Beginn des Trainings zur Verfügung stehen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_331.wav", "doc_id": "dJGfOSFgZO.seg_331", "src_text": "On the other hand, the combination of all turn-level Likert metrics explains far less of the quality, and fewer of these metrics carry unique information.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "der anderen Seite erklärt die Kombination von Likert-Skalen mit unterschiedlichen Niveaus die Qualität nur unzureichend, und einige dieser Skalen enthalten einzigartige Informationen.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_175.wav", "doc_id": "SLpqvupgvW.seg_175", "src_text": "Our data set collection methodology emphasizes informality using a cartoon completion setup.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Unsere Datensatzsammlungsmethodik betont Informalität, indem sie eine Kartonsammlung verwendet.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_77.wav", "doc_id": "TVCREhgqUP.seg_77", "src_text": "Then we jump to the next multiset token, to determine the second token in the output.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Dann springen wir zum nächsten Multiset-Token, um den zweiten Token in der Ausgabe zu bestimmen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_377.wav", "doc_id": "gGbuDbHhyc.seg_377", "src_text": "Second, WSL approaches should be compared with few-shot learning baselines, as both work on clean samples.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zweitens sollten die Ansätze zur WSL mit zukünftigen Lernalgorithmen verglichen werden, z. B. mit Arbeiten auf sauberen Sätzen.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_50.wav", "doc_id": "TVCREhgqUP.seg_50", "src_text": "Compositional generalization can be understood as the ability of a learner to handle deeper recursion and unseen compositions of phrases that have been seen individually during training.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Kompositionelle Generalisierung kann als die Fähigkeit eines Lernenden verstanden werden, tieferen Rekursionen und unerkannten Kompositionen von Phrasen zu handhaben, die einzeln während der Ausbildung gesehen wurden.", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_562.wav", "doc_id": "rISrKoXQCx.seg_562", "src_text": "So we could conduct a controlled experiment by further pretraining language model checkpoints on 6 different partisan corpora separated into news and social media, further divided into their political leaning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "lassen. So führen wir Kontrollexperimente durch, indem wir Sprachmodelle weiter ausbilden, und Kontrollpunkte auf sechs verschiedene Parteien aufteilen, die in Nachrichten und sozialen Medien aufgeteilt werden.", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_878.wav", "doc_id": "GvEBWkLmuI.seg_878", "src_text": "Thank you so much for listening.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Vielen Dank fürs Zuhören.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_232.wav", "doc_id": "oYCKgTzTDy.seg_232", "src_text": "And we'll also test Monolingual Model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Und wir testen auch ein monolinguales Modell.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_356.wav", "doc_id": "gGbuDbHhyc.seg_356", "src_text": "Second, if clean data is required, or if clean data is mandatory for WSL to work, then how many clean samples do we need?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "saubere Daten erforderlich sind oder wenn saubere Daten zur Validierung erforderlich sind, wie viele saubere Proben benötigen wir? Schließlich", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_246.wav", "doc_id": "oYCKgTzTDy.seg_246", "src_text": "We found that Encoder-Decoder or Encoder-PTR can be improved by training in a mixture of various languages.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir stellten fest, dass Encoder-Decoder oder Encoder-PDR verbessert werden können, indem man in einer Mischung aus verschiedenen Sprachen trainiert. Und wenn", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_762.wav", "doc_id": "XejEJmgUmE.seg_762", "src_text": "So why does the match prefix affect the language model judgement so much?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Warum beeinflusst der Match-Prefix also so stark die Sprachmodellurteile?", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_374.wav", "doc_id": "gGbuDbHhyc.seg_374", "src_text": "Our concrete recommendations for future work are as follows.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Unsere konkreten Empfehlungen für zukünftige Arbeiten sind wie folgt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_273.wav", "doc_id": "PIZEXUFLAR.seg_273", "src_text": "These tasks are derived from 21 existing open-source dataset and each task is equipped with five expert written instructions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Diese Aufgaben werden aus 21 bestehenden offenen Quellendatensätzen abgeleitet und jede Aufgabe ist mit fünf zusätzlichen Schreibweisen ausgestattet.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_541.wav", "doc_id": "dvGkKzmIaN.seg_541", "src_text": "The legend of the figures means the number of triggers in each sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die Legende der Figuren bedeutet die Anzahl der Trigger in jedem Satz.", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_683.wav", "doc_id": "oaOHnMCwad.seg_683", "src_text": "Design biases like the one that we just saw before might occur due to the positionality of the NLP researchers and model developers.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die Position der NPL-Forscher und -Entwickler", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_155.wav", "doc_id": "wLqFAuDnKa.seg_155", "src_text": "However, the \"Style/Awkward\" category for PaLM is lower than for the state-of-the-art systems, which is an additional signal that PaLM provides really fluent output, but still with some problems of accuracy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Allerdings ist die Stil-Output-Kategorie für PARM niedriger als für die modernsten Systeme, was ein zusätzliches Signal ist, dass PARM wirklich flüssige Ausgaben liefert, aber immer noch mit einigen Genauigkeitsproblemen.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_743.wav", "doc_id": "XejEJmgUmE.seg_743", "src_text": "So for example, here we have chosen like a typical pair of grammaticality from the BLiMP data set from the Adjunct Island case.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Beispiel haben wir hier typische Paare von Grammatikalität aus der Datenbank von der adjungierten Insel ausgewählt.", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_173.wav", "doc_id": "SLpqvupgvW.seg_173", "src_text": "We're not aware of a larger-scale public data set for the task, so we collect one using crowd annotation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "nicht bewusst, dass es sich um einen öffentlichen Datensatz handelt, daher sammeln wir ihn mit Crowdsourcing.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_748.wav", "doc_id": "XejEJmgUmE.seg_748", "src_text": "So that is what we call as the mismatch scenario.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "also ist das das, was wir als Missmatch-Szenario bezeichnen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_711.wav", "doc_id": "oaOHnMCwad.seg_711", "src_text": "We find that Dynahate is also most aligned to English speaking countries.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "handelt und dass es auch die meisten englischsprachigen Länder sind.", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_877.wav", "doc_id": "GvEBWkLmuI.seg_877", "src_text": "We just really can't make any assumptions or really study that further, without more transparency.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Sie können wirklich keine Vermutungen anstellen oder das weiter mit mehr Transparenz studieren.", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_835.wav", "doc_id": "GvEBWkLmuI.seg_835", "src_text": "So we can ask the model to generate a persona, which is a depiction of an imagined individual using a prompt like \"Imagine you are an Asian woman.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "können also das Modell fragen, wie es eine Persönlichkeit generieren würde, die eine Darstellung eines imaginären Individuums ist, das einen Promtpunkt wie „Stell dir vor, du bist eine asiatische Frau“", "score": 73.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_878.wav", "doc_id": "GvEBWkLmuI.seg_878", "src_text": "Thank you so much for listening.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Vielen Dank für das", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_401.wav", "doc_id": "WBLMIsdIrq.seg_401", "src_text": "We can think of words that have high P-CXMI as ones that require context for translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir können annehmen, dass Wörter mit hohem Cxmi-Wert für die Übersetzung erforderlich", "score": 62.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_94.wav", "doc_id": "uZBWfYjYnf.seg_94", "src_text": "Simultaneous speech translation, or SimulST, is the process of translating spoken language into a text in another language in real time, enabling cross-language communication.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Sprachübersetzung? Simultane Sprachübersetzung oder SimulSRT ist der Prozess der Übertragung einer gesprochenen Sprache in einen Text einer anderen Sprache in Echtzeit, wodurch eine Sprachübersetzung ermöglicht wird. Was", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_562.wav", "doc_id": "rISrKoXQCx.seg_562", "src_text": "So we could conduct a controlled experiment by further pretraining language model checkpoints on 6 different partisan corpora separated into news and social media, further divided into their political leaning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "für den sie vorgesehen sind. Daher werden in einem weiteren Sprachmodellprüfungsprotokoll sechs verschiedene Parteien in Nachrichten und sozialen Medien aufgeteilt. Mit weiteren Sprachmodellen", "score": 52.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_560.wav", "doc_id": "rISrKoXQCx.seg_560", "src_text": "We can also see that GPT-4 is the most liberal language model of them all, and GPT series are generally more socially liberal than BART series and its variants.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir können auch sehen, dass Gpt-Four das freiheitlichste Sprachmodell von allen ist und die Gpt-Theorien im Allgemeinen freier sind als die der Bert-Theorien.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_291.wav", "doc_id": "PIZEXUFLAR.seg_291", "src_text": "We also introduce an additional evaluation metric called sensitivity.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir führten auch eine zusätzliche Bewertungsmetrik namens Sensitivität ein, um", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_218.wav", "doc_id": "oYCKgTzTDy.seg_218", "src_text": "And Cross-Lingual Semantic Parsing is the task to translate queries in multiple natural languages into multiple meaning representations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und Crosslingo samy ist die Aufgabe, Fragen in mehreren natürlichen Sprachen in mehrere Bedeutungsdarstellungen zu übersetzen.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_157.wav", "doc_id": "wLqFAuDnKa.seg_157", "src_text": "For more details, please come to the full presentation of the paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "tun, um mehr Einzelheiten zu erfahren, bitte kommen Sie zu der vollständigen Präsentation des Papiers,", "score": 86.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_550.wav", "doc_id": "rISrKoXQCx.seg_550", "src_text": "According to a survey of the C4 Corpus, we can see that New York Times, Los Angeles Times, The Guardian, Huffington Post, etcetera are well covered in language model training data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Laut einer Umfrage des C4 Corpus können wir sehen, dass die New York Times, CNN und Fox News die drei am besten vertretenen Nachrichtenmedien sind. Los Angeles Times, The Guardian, Huffington Post etc. sind gut in Sprachmodellierungsdaten abgedeckt. Dies hat eine Mischung von", "score": 28.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_347.wav", "doc_id": "gGbuDbHhyc.seg_347", "src_text": "When compared to human annotations, the weaker annotations are much cheaper, yet they are also noisy, meaning that a certain amount of the annotations are incorrect.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Im Vergleich zu menschlichen Anmerkungen sind die schwachen Anmerkungen viel billiger, aber sie sind auch laut, was bedeutet, dass eine gewisse Menge der Anmerkungen falsch ist.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_195.wav", "doc_id": "SLpqvupgvW.seg_195", "src_text": "When we show this alternative question to the annotators, they know the name of these entities, but they don't necessarily know about the entities.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wenn wir diese alternative Frage den Annotatoren zeigen, wissen sie den Namen dieser Entitäten, aber sie wissen nicht unbedingt, was diese Entitäten sind.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_849.wav", "doc_id": "GvEBWkLmuI.seg_849", "src_text": "So for instance, the word \"warrior\" is usually associated with men.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zum Beispiel ist das Wort \"Mann\" oder das Wort \"Krieger\" normalerweise mit Männern verbunden,", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_69.wav", "doc_id": "TVCREhgqUP.seg_69", "src_text": "First, we tag each input token with an unordered multiset of tokens that will appear in the output.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "kennzeichnen wir jedes Eingabetoken mit einem unsortierten Mehrfachset von Tokens, die im Ausgang erscheinen werden.", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_171.wav", "doc_id": "SLpqvupgvW.seg_171", "src_text": "Here are some examples of indirect references for example, \"the newer one\" or \"the song that's not energetic.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "sind hier einige Beispiele für direkte Verweise, z. B. der neuere oder der nicht energetische Satz.", "score": 54.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_386.wav", "doc_id": "WBLMIsdIrq.seg_386", "src_text": "So a lot of translations depend on context.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Also hängen viele Übersetzungen von dem Kontext ab:", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_677.wav", "doc_id": "oaOHnMCwad.seg_677", "src_text": "So let's start off by imagining that you're working for a newspaper and you're sifting through comments under your news article trying to remove toxic content.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "durchgeführt. Also, beginnen wir damit, dass Sie für eine Zeitung arbeiten und sich durch Kommentare unter Ihrem Artikel arbeiten, um toxische Inhalte zu entfernen.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_313.wav", "doc_id": "dJGfOSFgZO.seg_313", "src_text": "The common practice is to use human evaluation, such as by asking human judges to select which of two conversations is better or to rate conversations given a Likert scale.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Die übliche Praxis ist, menschliche Bewertungen zu verwenden, z. B. indem man Menschen fragt, welche von zwei Konversationen besser ist, oder indem man Konversationen anhand eines Likert-Skalens bewertet.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_38.wav", "doc_id": "aQpIWggfCo.seg_38", "src_text": "We find CoScript shows high pluralism in the generated specific goals.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir stellen fest, dass CoScript bei den erzeugten spezifischen Zielen eine hohe Hypothese zeigt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_359.wav", "doc_id": "gGbuDbHhyc.seg_359", "src_text": "First, we find that, interestingly, recent WSL methods indeed require clean validation samples to work properly.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zunächst stellen wir fest, dass die neueren WSL-Methoden tatsächlich saubere Validierungssamples benötigen, um ordnungsgemäß zu funktionieren.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_59.wav", "doc_id": "TVCREhgqUP.seg_59", "src_text": "In particular, they often fail to reproduce the systematic correspondences between input and output, such as those that are color-coded in the example.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Insbesondere produzieren sie oft Ausgaben, die die systematischen Korrespondenzen zwischen Eingabe und Ausgabe nicht wiederholen, wie die in dem Beispiel farbgekodierten. Ein beliebter", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_754.wav", "doc_id": "XejEJmgUmE.seg_754", "src_text": "So first, we look at the Wikipedia sentences, which are completely irrelevant to the current query pair, and there we find that the MPP judgments are mostly robust for arbitrary context length.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "dem Modell aus? Zuerst schauen wir uns die Wikipedia-Sätze an, die für das aktuelle Suchpaar vollständig irrelevant sind, und dort stellen wir fest, dass die MP3-Beurteilungen für die willkürlichen Kontextlinien ziemlich robust sind.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_147.wav", "doc_id": "wLqFAuDnKa.seg_147", "src_text": "The dev data is much more curated, and with higher quality than the training data, that it's more noisy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die Daten sind viel genauer, und mit hoher Qualität sind die Trainingsdaten und die Ergebnisse so, dass sie eine bessere", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_705.wav", "doc_id": "oaOHnMCwad.seg_705", "src_text": "We then compared these annotations with Dynahate, Perspective API, Rewire API, Hate Roberta and GPT 4.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "mit diakritischen Zeichen, Perspektiven, Perspektiven, Perspektiven, Perspektiven, Perspektiven, Perspektiven, Perspektiven, Perspektiven, Perspektiven,", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_216.wav", "doc_id": "oYCKgTzTDy.seg_216", "src_text": "Today I'm going to present our work \"XSemPLR: Cross-Lingual Semantic Parsing in Multiple Natural Languages and Meaning Representations\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Heute werde ich meine Arbeit vorstellen, Exemplarische Kreuzsprachige Semantische Analyse in mehreren natürlichen Sprachen und vielen Darstellungen. Daher", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_416.wav", "doc_id": "WBLMIsdIrq.seg_416", "src_text": "And we called our tagger the Multilingual Discourse-Aware, or MuDA tagger.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "und nennen ihn den multilingualen Diskurs-Tag. Es kann auch", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_564.wav", "doc_id": "rISrKoXQCx.seg_564", "src_text": "For example, for RoBERTa further trained on the left-leaning Reddit corpus we can see a substantial liberal shift in terms of its political biases.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "können wir für Robert, der weiterführend ist, weiterführend auf der linken Seite des Korpus trainiert wird, eine signifikante Liberalisierung in seinen Begriffen sehen. In Bezug auf seine politischen Vorurteile.", "score": 45.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_842.wav", "doc_id": "GvEBWkLmuI.seg_842", "src_text": "To capture these patterns, our method has two parts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "diese Muster zu erfassen, besteht unsere Methode aus zwei Teilen,", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_580.wav", "doc_id": "rISrKoXQCx.seg_580", "src_text": "We would also like to highlight that we expose the unique dilemma regarding language model political biases.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir möchten auch darauf hinweisen, dass wir die einzigartigen Dilemmata hinsichtlich Sprachmodell-politischer Voreingenommenheiten aufdecken, wie z.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_681.wav", "doc_id": "oaOHnMCwad.seg_681", "src_text": "Where prospective AP is really not as sensitive to offensive terms that are more common in Indian contexts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "unsere Perspektive APs wirklich nicht sensible zu offensiven Begriffen sind und häufiger in indischen Kontexten vorkommen.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_507.wav", "doc_id": "dvGkKzmIaN.seg_507", "src_text": "For example, OpenAI offers a GPT based embedding API.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Beispielsweise bietet OpenI eine gpdb-basierte Embedding-API an.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_643.wav", "doc_id": "FLkGnzVRew.seg_643", "src_text": "Studying dissonance expressed in language can also be beneficial in understanding extremism and polarization of vulnerable groups.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Die Untersuchung der Diskrepanz, die in der Sprache ausgedrückt wird, kann auch bei der Verständigung von Extremismus und Polarisierung von vulnerablen Gruppen nützlich sein.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_473.wav", "doc_id": "SUkmfOTvGi.seg_473", "src_text": "We then fine-tuned over 20 models on CoNLL-2003.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "versehen wurde. Dann feinabstimmten wir über zwanzig Modelle auf dem Corolla aus dem Jahr 1993.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_706.wav", "doc_id": "oaOHnMCwad.seg_706", "src_text": "Our study in the end amassed over 16,000 annotations from over 1000 annotators from 87 countries.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Perspektiven, Perspektiven, Perspektiven, Perspektiven, Perspektiven, Perspektiven, Perspektiven. Jetzt", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_255.wav", "doc_id": "oYCKgTzTDy.seg_255", "src_text": "For example, Encoder-Decoder outperforms previous work or achieves comparable results.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "finden, zum Beispiel, dass Encoder-Decoder-Modelle für die Verarbeitung der englischen Muttersprache", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_115.wav", "doc_id": "uZBWfYjYnf.seg_115", "src_text": "And we compare also with the state-of-the-art architecture specifically tailored for simultaneous pre-translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und wir vergleichen sie auch mit der neuesten Architektur, die speziell für simultane Übersetzung optimiert wurde.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_391.wav", "doc_id": "WBLMIsdIrq.seg_391", "src_text": "However, evaluating how well models can translate cases like this is pretty hard.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Es ist jedoch ziemlich schwierig, zu bewerten, wie gut Modelle solche Fälle kontrastieren. Zunächst", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_85.wav", "doc_id": "TVCREhgqUP.seg_85", "src_text": "As a consequence, for a given token we don't know which multiset it came from, which poses a challenge for training.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Als Konsequenz wissen wir für ein gegebenes Token nicht, welchen Multisetter es stammt, was eine Herausforderung für das Training darstellt.", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_586.wav", "doc_id": "rISrKoXQCx.seg_586", "src_text": "Ok, great.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Okay, großartig,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_250.wav", "doc_id": "oYCKgTzTDy.seg_250", "src_text": "In this figure, the blue line is Cross-lingual Few-shot transfer.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In diesem Diagramm ist die blaue Linie die Übersetzung zwischen Sprachen mit wenigen Schüssen, die", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_487.wav", "doc_id": "SUkmfOTvGi.seg_487", "src_text": "For data overfitting, we saw that from the graph on the right, the red best fit line has a gradient that is greater than one.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Bei der Anpassung der Überlänge sahen wir von der Grafik auf der rechten Seite, dass die rote Linie, die am besten passt, einen Abstand von mehr als einem hat.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_65.wav", "doc_id": "TVCREhgqUP.seg_65", "src_text": "Obtaining trees may also involve specialized grammar-induction procedures.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "können durchgeführt werden.", "score": 5.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_136.wav", "doc_id": "wLqFAuDnKa.seg_136", "src_text": "And this can go, in extreme cases, up to 40 BLEURT points.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und das kann in extremen Fällen bis zu vierzig Punkte", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_31.wav", "doc_id": "aQpIWggfCo.seg_31", "src_text": "Creating the dataset is an essential step to this end.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die Erstellung eines Datensatzes ist ein wesentlicher Schritt zu seinem Ende.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_832.wav", "doc_id": "GvEBWkLmuI.seg_832", "src_text": "They usually rely on hand-constructed data sets that are very time-consuming to curate and they also usually only. measure very specific stereotypes, meaning that they don't generalize well to other demographics or contexts, or they simply capture very general broad associations, like negative associations with particular groups.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "stützen sich in der Regel auf handgefertigte Datensätze, die sehr zeitaufwändig zu erstellen sind. Und sie messen üblicherweise nur sehr spezifische Stereotypen, was bedeutet, dass sie nicht allgemein auf andere Demografien oder Kontexte angewendet werden können, oder sie fangen sehr allgemeine, negative Assoziationen mit bestimmten Gruppen", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_306.wav", "doc_id": "PIZEXUFLAR.seg_306", "src_text": "So this is a QR code for our data and model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Dies ist ein QR-Code für unsere Daten und das Modell.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_336.wav", "doc_id": "dJGfOSFgZO.seg_336", "src_text": "With the rapid pace of improvement in the field, many of these error rates could see a decrease in new models released since our evaluation was conducted.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Mit dem schnellen Fortschritt im Bereich könnten viele dieser Fehlerquoten in neuen Modellen, die seit unserer Bewertung veröffentlicht wurden, abnehmen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_674.wav", "doc_id": "oaOHnMCwad.seg_674", "src_text": "Hi everyone.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Ich bin", "score": 1.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_208.wav", "doc_id": "SLpqvupgvW.seg_208", "src_text": "But this is not realistic.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Prozent. Aber das ist nicht realistisch.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_80.wav", "doc_id": "TVCREhgqUP.seg_80", "src_text": "To give you a teaser of the experimental results, here we compare our method with other treeless models on the COGS benchmark.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Um Ihnen ein Beispiel für die experimentellen Ergebnisse zu geben, vergleichen wir hier unsere Methode mit anderen treelose Modellen auf dem Koggs-Benchmark;", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_574.wav", "doc_id": "rISrKoXQCx.seg_574", "src_text": "Similar trends also happen for fake news detection, where we see that left-leaning language models are better at detecting misinformation from their opposite political leaning and vice versa.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Ähnliche Trends haben sich auch für die Erkennung von Falschmeldungen ergeben, bei denen wir sehen, dass lebende Sprachmodelle besser darin sind, Informationen von ihren gegnerischen politischen Positionen zu erkennen. In diesem Beitrag werden", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_878.wav", "doc_id": "GvEBWkLmuI.seg_878", "src_text": "Thank you so much for listening.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vielen Dank, dass Sie", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_156.wav", "doc_id": "wLqFAuDnKa.seg_156", "src_text": "And that's it for this really short overview.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Und das ist für diese wirklich kurze Sicht.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_42.wav", "doc_id": "aQpIWggfCo.seg_42", "src_text": "We evaluate constrained language planning ability of large language models and develop an over-generate-then-filter method for large language models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "haben, die Sprachplanungsfähigkeit von Großsprachenmodellen bewertet haben und eine übertragende Filtermethode für Großsprachenmodell entwickelt haben.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_322.wav", "doc_id": "dJGfOSFgZO.seg_322", "src_text": "For example, ABC-Eval measures the number of turns in which a chat model ignores its partner or says something irrelevant, contradicts itself or its partner, hallucinates incorrect facts or violates common sense knowledge, and when the model succeeds or fails to show empathy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zum Beispiel misst ABC-EVAL die Anzahl der Drehungen, in denen ein Chatmodell seinen Partner ignoriert oder etwas Irrelevantes sagt. Widerspricht sich selbst oder seinem Partner. Halluziniert falsche Fakten oder verletzt das allgemeine Wissen. Und wenn das Modell Erfolg oder Misserfolg bei der Darstellung von Empathie hat.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_303.wav", "doc_id": "PIZEXUFLAR.seg_303", "src_text": "So overall, we propose the first large scale multi-model instruction tuning dataset with significantly improved their short capability of OFA, and we explore different transfer learning technique and show their benefits.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Insgesamt schlagen wir ein erstes groß angelegtes Multimodel-Tuning-Datenset vor, das die Fähigkeiten von OIF deutlich verbessert, und wir untersuchen verschiedene Transferlerntechniken und zeigen deren Vorteile auf.", "score": 69.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_318.wav", "doc_id": "dJGfOSFgZO.seg_318", "src_text": "Our approach attempts to reduce the subjectivity of human evaluation by explicitly annotating whether or not each model response expresses certain behaviors, such as responding with irrelevant information or contradicting itself.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Unsere Herangehensweise versucht, die Subjektivität der menschlichen Bewertung zu reduzieren, indem explizit angegeben wird, ob oder nicht jede Modellantwort bestimmte Verhaltensweisen ausdrückt, wie z. B. das Antworten auf irrelevanten Informationen oder das Widersprechen seines eigenen Verhaltens.", "score": 63.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_97.wav", "doc_id": "uZBWfYjYnf.seg_97", "src_text": "Long and complicated training procedures, for example, training involving different optimization objectives.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und komplexe Trainingsverfahren, z. B. das Training unter verschiedenen Optimierungszielen.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_509.wav", "doc_id": "dvGkKzmIaN.seg_509", "src_text": "Therefore, it's necessary to protect the copyright of embedding as services.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Daher ist es notwendig, die Urheberrechte von Einbettungen und Diensten zu schützen.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_260.wav", "doc_id": "oYCKgTzTDy.seg_260", "src_text": "And et cetera.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und so weiter,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_254.wav", "doc_id": "oYCKgTzTDy.seg_254", "src_text": "We also find some other interesting findings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir finden auch einige andere interessante Ergebnisse,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_519.wav", "doc_id": "dvGkKzmIaN.seg_519", "src_text": "Then let me introduce the details of our embedding marker.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Lassen Sie mich nun die Details unseres Embedding-Markers vorstellen:", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_495.wav", "doc_id": "SUkmfOTvGi.seg_495", "src_text": "So going back to the question that we raised in the title of our paper Do CoNLL-2003 taggers still work in 2023?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "wurde. Zurück zur Frage, die wir in der Überschrift unseres Papiers gestellt haben: „Funktionieren die Cylon-Zeta-Tags noch im Jahr 2000?“ Und", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_865.wav", "doc_id": "GvEBWkLmuI.seg_865", "src_text": "Furthermore, there's a lot of common tropes that are reflected in these words, especially for women of color.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "über hinaus gibt es viele Gemeinsamkeiten, die in diesen Wörtern reflektiert werden, insbesondere für Frauen", "score": 68.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_9.wav", "doc_id": "aQpIWggfCo.seg_9", "src_text": "A good planner should write scripts that are reasonable and faithful to constraints.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "ein guter Planer sollte Skripte schreiben, die vernünftig und konform sind.", "score": 88.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_490.wav", "doc_id": "SUkmfOTvGi.seg_490", "src_text": "So what about temporal drift then?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Was ist dann mit der Temperatur? Für", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_355.wav", "doc_id": "gGbuDbHhyc.seg_355", "src_text": "First, is clean validation data necessary for WSL or can we maybe use a noisy validation set instead?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "zu stellen: Erstens, sind saubere Validierungsdaten für WSL notwendig, oder können wir stattdessen ein lautes Validierungsset verwenden? Zweitens, wenn", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_66.wav", "doc_id": "TVCREhgqUP.seg_66", "src_text": "In this paper, we don't use trees and introduce a neural seq2seq model that directly models the correspondences between fragments of the input and fragments of the output.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In dieser Arbeit verwenden wir keine Bäume und stellen ein neuronales Sequenz-zu-Sequenz-Modell vor, das die Korrespondenzen zwischen Fragmenten des Eingabedaten und Fragmenten des Ausgabedaten direkt modelliert.", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_643.wav", "doc_id": "FLkGnzVRew.seg_643", "src_text": "Studying dissonance expressed in language can also be beneficial in understanding extremism and polarization of vulnerable groups.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Das Erlernen von Ausspracheunterschieden kann auch nützlich sein, um Extremismus und Polarisierung von Gruppen zu verstehen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_286.wav", "doc_id": "PIZEXUFLAR.seg_286", "src_text": "Each instance is randomly combined with one of its five instruction templates.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "wobei jede Instanz zufällig mit einem seiner fünf Anweisungsschemata kombiniert", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_809.wav", "doc_id": "WTTtiRKFZI.seg_809", "src_text": "So, \"salt and pepper\" and not \"pepper and salt\", measured in syllables.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "sein, also Salz und Pfeffer und nicht Pfeffer und Salz, gemessen in Silben.", "score": 68.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_231.wav", "doc_id": "oYCKgTzTDy.seg_231", "src_text": "And for example, we train the English model on English query and during inference we translate the German query using API to English and then use the trained model to predict the SQL.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Beispielsweise trainieren wir das englische Modell auf der Grundlage der englischen Anfrage und während der Inferenz übersetzen wir die deutsche Anfrage mithilfe der API ins Englische und verwenden dann das trainierte Modell, um die Fortsetzung vorherzusagen.", "score": 52.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_434.wav", "doc_id": "hgIDlKNiFM.seg_434", "src_text": "We introduce the first biomedical model in French named DrBERT, which is based on RoBERTa and trained on NACHOS, which is a data set of medical crawled data from the web.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir führen das erste biomedizinische Modell in Frankreich ein, das den Namen „Doctor Bert“ trägt und auf Roberta basiert, und trainieren es auf NACHOS, einem Datensatz medizinischer Daten aus dem Internet.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_12.wav", "doc_id": "aQpIWggfCo.seg_12", "src_text": "As shown in the table, we extend the abstract goals with multi-faceted constraints for human-in-the-loop data acquisition using InstructGPT.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und wie in der Tabelle dargestellt, erweitern wir die abstrakten Ziele mit mehrphasigen Einschränkungen für die Mensch-Luk-Datenerfassung mit Instr. Gpt.", "score": 63.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_653.wav", "doc_id": "FLkGnzVRew.seg_653", "src_text": "Since the initial model was not able to capture the dissonance class at all, we start the active learning process by transferring weights from closely related tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Da das ursprüngliche Modell nicht in der Lage war, die Dissensklasse überhaupt zu erfassen, starten wir den aktiven Lernvorgang, indem wir die Gewichte des Modells übertragen.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_555.wav", "doc_id": "rISrKoXQCx.seg_555", "src_text": "Secondly, how do language models with different political leanings actually perform on downstream tasks and whether that might result in fairness issues in NLP applications?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "wie werden Sprachmodelle mit unterschiedlichen politischen Linien tatsächlich auf Downstream-Tasks ausgeführt, und ob das mich in der Verwendung von NL-P-Anwendungen zu Unfairness führen könnte? Daher", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_378.wav", "doc_id": "gGbuDbHhyc.seg_378", "src_text": "Third, continuous fine-tuning is a simple yet strong baseline that should be considered in future work in WSL.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Drittens ist die kontinuierliche Feinabstimmung ein einfacher, aber starker Baseline, der in zukünftigen Arbeiten zur WSL berücksichtigt werden sollte.", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_427.wav", "doc_id": "WBLMIsdIrq.seg_427", "src_text": "We also compared different commercial systems and our benchmark shows that DeepL is usually more accurate than Google Translate for document-level translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir vergleichen auch verschiedene kommerzielle Systeme und unsere Benchmark-Marken zeigen, dass DVB für die lokale Dokumentenübertragung normalerweise genauer ist als Google-Übersetzung.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_380.wav", "doc_id": "gGbuDbHhyc.seg_380", "src_text": "You can find it via the QR code on this slide.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Sie können ihn über den QR-Code auf dieser Folie", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_76.wav", "doc_id": "TVCREhgqUP.seg_76", "src_text": "For the first output position, we simply select one, as highlighted in red.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "werden soll; für die erste Ausgangsposition wählen wir einfach einen „hervorgehobenen“ roten Token.", "score": 61.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_746.wav", "doc_id": "XejEJmgUmE.seg_746", "src_text": "So we can do the same thing by choosing unacceptable sentences from the same matching, and that could also be used to test the models acceptability.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir können dasselbe tun, indem wir unakzeptable Sätze aus der gleichen Abstimmung auswählen, und das könnte auch zum Testen der Akzeptanz des Modells verwendet werden.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_452.wav", "doc_id": "hgIDlKNiFM.seg_452", "src_text": "These models are compared to six baseline models which are CamemBERT OSCAR 138 GB, CamemBERT OSCAR 4 GB, CamemBERT CCNET 4 GB, PubMedBERT, BioBERT, and ClinicalBERT.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Diese Modelle sind mit sechs Bildschirmgrößenmodellen vergleichbar, die folgende sind: Camembert Oscar 1.038 GB, Camembert Oscar 4 GB, Camembert CINet 4 GB, Camembert Pummet, Camembert Biober und Camembert Clinber.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_527.wav", "doc_id": "dvGkKzmIaN.seg_527", "src_text": "The provided embedding is a weight summation of the target embedding and the original embedding.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Das bereitgestellte Embedding ist eine gewichtete Summe des Ziel-Embeddings und des ursprünglichen Embeddings.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_845.wav", "doc_id": "GvEBWkLmuI.seg_845", "src_text": "And also this enables direct comparison between our generated personas and the human written responses.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Subjekte zu erschaffen, die auch Rassenstereotypen aufwiesen. Und auch dies ermöglicht eine direkte Vergleichsweise zwischen unseren generierten Personen und den menschlichen Reaktionen.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_473.wav", "doc_id": "SUkmfOTvGi.seg_473", "src_text": "We then fine-tuned over 20 models on CoNLL-2003.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir haben dann über 20 Modelle auf Kornel 2003 feinjustiert.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_176.wav", "doc_id": "SLpqvupgvW.seg_176", "src_text": "The cartoon has three speech bubbles.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Der Cartoon hat drei Sprechblasen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_809.wav", "doc_id": "WTTtiRKFZI.seg_809", "src_text": "So, \"salt and pepper\" and not \"pepper and salt\", measured in syllables.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "als die rechten Konjunktionen.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_367.wav", "doc_id": "gGbuDbHhyc.seg_367", "src_text": "As we can see, if we have 10 samples per class, direct fine-tuning starts to beat WSL approaches.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "können sehen, wenn wir zehn Beispiele pro Klasse haben, beginnt die direkte Feinabstimmung, die WSL-Ansätze zu überbieten.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_833.wav", "doc_id": "GvEBWkLmuI.seg_833", "src_text": "Furthermore, most work in this space doesn't account for intersectionality, which is the notion that multi-faceted social identities can compound biases and be unique loci of harm.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "ein. Zusätzlich zu den meisten Arbeiten in diesem Bereich geht es um Intersektionalität, was die Vorstellung ist, dass mehrere soziale Identitäten zusammengesetzt sein können.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_687.wav", "doc_id": "oaOHnMCwad.seg_687", "src_text": "And so one question that people might ask is, do datasets and models have positionality?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und eine Frage, die die Leute stellen könnten, lautet: Haben Datensätze Modelle Positionalität?", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_129.wav", "doc_id": "wLqFAuDnKa.seg_129", "src_text": "This involves using the latest test sets to avoid an overlap of the test data with the training data of the language model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Dies beinhaltet die Verwendung der neuesten Testsets, um eine Überlappung der Testdaten mit den Trainingsdaten der Sprachmodelle zu vermeiden.", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_610.wav", "doc_id": "oeooqChmKK.seg_610", "src_text": "We vary the availability of these two pieces of information such that it may either be found in a single source, or in multiple sources.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die Verfügbarkeit dieser beiden Informationsstücke ist zu prüfen, da sie entweder in einer einzigen oder in mehreren Quellen gefunden werden können.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_591.wav", "doc_id": "oeooqChmKK.seg_591", "src_text": "Natural language understanding models draw on a variety of knowledge sources, such as knowledge contained in their parameters, usually acquired by a pretraining, and knowledge given in inputs at inference time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Natürliche Sprachverstehensmodelle greifen auf eine Vielzahl von Wissensquellen zurück, wie z. B. das Wissen, das in den Parametern enthalten ist, das normalerweise durch Vorbereiten und Wissen, das in den Eingaben enthalten ist, erworben wird.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_139.wav", "doc_id": "wLqFAuDnKa.seg_139", "src_text": "So in this example here, where we perform translation from German into English, the German sentences, the source sentences, are marked with German colon and the English translations with English colon.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "In diesem Beispiel, wenn wir eine Übersetzung von Deutsch ins Englisch vornehmen, werden die deutschen Sätze mit einem deutschen Kolon und die englischen Übersetzungen mit einem englischen Kolon markiert.", "score": 69.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_818.wav", "doc_id": "WTTtiRKFZI.seg_818", "src_text": "In such cases, the left conjunct prefers to be shorter; the most of the biggest difference between the two conjuncts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In solchen Fällen bevorzugt der linke Gouverneur, dass die Konjunktion kürzer ist als die Konjunktion.'Die größere Differenz zwischen den beiden Konsonanten ist", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_411.wav", "doc_id": "WBLMIsdIrq.seg_411", "src_text": "And similarly, we find that context is important to translate in the right formality.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Und ähnlich finden wir, dass der Kontext unterstützt, um die richtige Formulierung zu übersetzen.", "score": 57.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_143.wav", "doc_id": "wLqFAuDnKa.seg_143", "src_text": "It's the examples that carry most of the weight.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "sind die Beispiele, die den Großteil", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_52.wav", "doc_id": "TVCREhgqUP.seg_52", "src_text": "As usual, we have a training set of utterances.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "es so aussehen, dass wir in diesem Fall die", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_833.wav", "doc_id": "GvEBWkLmuI.seg_833", "src_text": "Furthermore, most work in this space doesn't account for intersectionality, which is the notion that multi-faceted social identities can compound biases and be unique loci of harm.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Darüber hinaus berücksichtigen die meisten Arbeiten auf der Welt die Intersektionalität nicht, die Vorstellung, dass multi-facettierte soziale Identitäten durch Biaise und können eine einheitliche Seite des Leidens", "score": 35.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_362.wav", "doc_id": "gGbuDbHhyc.seg_362", "src_text": "This indicates that WSL approaches actually require cleanly labeled data to work properly, and the annotation cost for obtaining clean validation samples should not be overlooked.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Dies deutet darauf hin, dass WSL-Ansätze tatsächlich sauber gekennzeichnete Daten benötigen, um richtig zu funktionieren, und dass die Anmerkungskosten für saubere Validierungsmuster nicht vernachlässigt werden dürfen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_632.wav", "doc_id": "FLkGnzVRew.seg_632", "src_text": "Hello, my name is Vasudha and I'm a Computer Science PhD candidate at Stony Brook University.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hallo, mein Name ist Vasudha und ich bin Doktorand für Informatik an der Stony Brook University.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_9.wav", "doc_id": "aQpIWggfCo.seg_9", "src_text": "A good planner should write scripts that are reasonable and faithful to constraints.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Ein guter Planer sollte Skripte schreiben, die vernünftig und den Einschränkungen treu sind.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_714.wav", "doc_id": "oaOHnMCwad.seg_714", "src_text": "However, when models and data sets are aligned to specific populations, some are inevitably left behind.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wenn jedoch Modelle und Datensätze spezifischen Bevölkerungsgruppen zugewiesen werden, sind einige unvermeidlich hinter den anderen zurückgelassen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_313.wav", "doc_id": "dJGfOSFgZO.seg_313", "src_text": "The common practice is to use human evaluation, such as by asking human judges to select which of two conversations is better or to rate conversations given a Likert scale.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Der Common-Practice-Prozess ist eine menschliche Beurteilung, z. B. von menschlichen Richtern, um zu bestimmen, welche der beiden Konversationen besser ist oder Konversationen zu bewerten.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_150.wav", "doc_id": "wLqFAuDnKa.seg_150", "src_text": "But, PaLM comes pretty close to a commercial system.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "aber Palm ist in unserem Fall ziemlich nah", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_584.wav", "doc_id": "rISrKoXQCx.seg_584", "src_text": "And it's incredibly hard to determine what is actually neutral and should be retaining language monitoring data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "es ist unglaublich schwer, zu bestimmen, was tatsächlich neutral ist und welche Sprachdaten wir zurückbehalten sollten.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_595.wav", "doc_id": "oeooqChmKK.seg_595", "src_text": "Pretrained parameters can contain information about what presidents do and what a TV is but they cannot reliably know who this instance-specific entity \"John\" is, or who the new president is, because the president might have changed since pretraining.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vorbereitungsparameter können Informationen über das, was Präsidenten tun, und was eine ATLEI ist, enthalten, aber sie können nicht zuverlässig wissen, wer diese im Augenblick spezifische Einheit, John, ist, oder wer der neue Präsident ist, weil der Präsident sich vielleicht seit seiner Vorbereitungskurse geändert hat. Daher", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_481.wav", "doc_id": "SUkmfOTvGi.seg_481", "src_text": "We found that usually larger models lead to better generalization.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir haben festgestellt, dass üblicherweise größere Modelle zu einer besseren Generalisierung führen.", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_663.wav", "doc_id": "FLkGnzVRew.seg_663", "src_text": "We find that the proposed PRC strategy works better than other state-of-the-art strategies, although the difference is small.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir stellen fest, dass die vorgeschlagene PRC-Strategie besser funktioniert als andere State-of-the-Art-Strategien, obwohl der Unterschied klein ist,", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_363.wav", "doc_id": "gGbuDbHhyc.seg_363", "src_text": "Our second finding is that increasing the number of clean validation samples will help WSL approaches to achieve better performance, as shown in the figure on the left.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Unsere zweite Erkenntnis ist, dass die Erhöhung der Anzahl der validierten Beispiele helfen wird, WSL-Ansätze zu besseren Leistungen zu bringen, wie in der linken Abbildung gezeigt: typischerweise", "score": 78.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_470.wav", "doc_id": "SUkmfOTvGi.seg_470", "src_text": "At the same time, if we do observe poor generalization, what causes the performance drop of these models?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Gleichzeitig, wenn wir eine schlechte Generalisierung beobachten, was verursacht den Leistungsverlust dieser Modelle?", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_434.wav", "doc_id": "hgIDlKNiFM.seg_434", "src_text": "We introduce the first biomedical model in French named DrBERT, which is based on RoBERTa and trained on NACHOS, which is a data set of medical crawled data from the web.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir stellen das erste biomedizinische Modell in Französisch namens „Dr. Bert“ vor, das auf „Roberta“ basiert und auf „Nacos“ trainiert wurde, einem Datensatz medizinischer Crowd-Daten aus dem Web.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_247.wav", "doc_id": "oYCKgTzTDy.seg_247", "src_text": "We found it is because most of the major natural languages can obtain performance gain, except that English performance drops in seven datasets and only gains in three datasets.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir fanden heraus, dass dies der Fall ist, weil die meisten der wichtigsten natürlichen Sprachen Leistungszuwächse erzielen können, mit Ausnahme des Englischen, dessen Leistung abnimmt. in sieben Datensätzen und nur in drei Datensätzen. Ich", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_644.wav", "doc_id": "FLkGnzVRew.seg_644", "src_text": "Finally, cognitive dissonance is important to understand personal cognitive styles of individuals and helps us understand decision making processes better.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Schließlich ist es wichtig, persönliche kognitive Stile von Individuen zu verstehen, und hilft uns, Entscheidungsprozesse besser zu verstehen.", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_227.wav", "doc_id": "oYCKgTzTDy.seg_227", "src_text": "It contains 9 datasets in various domains, 5 semantic parsing tasks, 8 meaning representations, and 22 natural languages in 15 language families.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Norwegisch, Dänisch, Niederländisch, Norwegisch, Dänisch, Niederländisch, Norwegisch, Dänisch, Niederländisch, Norwegisch, Dänisch, Niederländisch, Norwegisch, Dänisch, Niederländisch, Norwegisch, Dänisch,", "score": 1.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_187.wav", "doc_id": "SLpqvupgvW.seg_187", "src_text": "Where A and B are samples from Wikipedia.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "wobei A und B Beispiele aus Wikipedia sind.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_849.wav", "doc_id": "GvEBWkLmuI.seg_849", "src_text": "So for instance, the word \"warrior\" is usually associated with men.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "So wird zum Beispiel das Wort „Man“ oder „Kriegerin“ normalerweise mit „Man“ oder", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_181.wav", "doc_id": "SLpqvupgvW.seg_181", "src_text": "And in the third speech bubble, Bob uses an indirect reference to select one of these entities, for example, \"the newer one.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Frage und in der dritten Sprechblase verwendet Bob eine indirekte Referenz, um eine dieser Entitäten auszuwählen, zum Beispiel die neuere.", "score": 88.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_750.wav", "doc_id": "XejEJmgUmE.seg_750", "src_text": "And we can do the same for unacceptability case.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "können dasselbe für den Fall der Unzulässigkeit tun.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_280.wav", "doc_id": "PIZEXUFLAR.seg_280", "src_text": "So for the training dataset, we use 53 tasks from 9 groups for training and we sample 10,000 instances per task.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Für den Trainingsdatensatz verwenden wir drei Aufgaben aus der Gruppe N für die Ausbildung, und zwar zehntausend Instanzen", "score": 51.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_229.wav", "doc_id": "oYCKgTzTDy.seg_229", "src_text": "The first one is Translate-Test.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Norwegisch, Dänisch, Niederländisch, Norwegisch, Dänisch, Niederländisch, Norwegisch, Dänisch, Niederländisch, Norwegisch, Dänisch, Niederländisch, Norwegisch, Dänisch, Niederländisch, Norwegisch", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_592.wav", "doc_id": "oeooqChmKK.seg_592", "src_text": "Recent works in tasks like question answering show that models can use pretrained-time knowledge to solve the task.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Arbeiten in Aufgaben wie Fragen beantworten zeigen, dass Modelle Vorwissen verwenden können, um Aufgaben zu lösen.", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_227.wav", "doc_id": "oYCKgTzTDy.seg_227", "src_text": "It contains 9 datasets in various domains, 5 semantic parsing tasks, 8 meaning representations, and 22 natural languages in 15 language families.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "neunundneunzig Zellen in verschiedenen Domänen, fünf Symmetrietaufgaben, acht Bedeutungsrepräsentationen und zweiundzwanzig natürliche Sprachen in fünfzehn", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_491.wav", "doc_id": "SUkmfOTvGi.seg_491", "src_text": "For temporal drift, we did an experiment to retrain or continue to pre-train some models with more recent data and we found that the performance degrades with larger temporal gap and this confirms our hypothesis that the main cause of the performance drop is temporal drift.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "zeitliche Verzögerungen führten wir ein Experiment durch, bei dem einige Modelle mit neueren Daten erneut trainiert oder vortrainiert wurden, und stellten fest, dass die Leistung mit größeren zeitlichen Abständen abnimmt. Und dies bestätigt unsere Hypothese, dass die Hauptursache für den Leistungsabfall eine zeitliche Drift ist.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_373.wav", "doc_id": "gGbuDbHhyc.seg_373", "src_text": "Their performance gain and practicality are heavily overestimated.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Leistungsgewinn und ihre Praktikabilität werden stark überschätzt.", "score": 83.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_600.wav", "doc_id": "oeooqChmKK.seg_600", "src_text": "Here is an example from our data set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "ein Beispiel aus unserem Datensatz:", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_64.wav", "doc_id": "TVCREhgqUP.seg_64", "src_text": "Typically, this involves considerable formalism-specific pre-processing of the logical forms, for example, to handle variable symbols.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Verfahren sein. Typischerweise ist dies ein umfangreiches Verfahren. Formalismus-spezifische Vorkompilation der logischen Formen, zum Beispiel zur Behandlung von Variablen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_679.wav", "doc_id": "oaOHnMCwad.seg_679", "src_text": "Where prospective API is able to detect correctly toxic instances.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "in der Lage ist, Toxizitäten korrekt zu erkennen.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_500.wav", "doc_id": "dvGkKzmIaN.seg_500", "src_text": "Hello everyone, my name is Jingwei Yi from the University of Science and Technology of China.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hallo alle, mein Name ist Jingwei E von der Universität für Wissenschaft und Technologie in China.", "score": 74.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_268.wav", "doc_id": "PIZEXUFLAR.seg_268", "src_text": "Additionally, at the time of our research, we discovered a considerable discrepancy in the availability of instructional datasets between NLP and multi-modal.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Außerdem haben wir zur Zeit unserer Forschung eine erhebliche Diskrepanz in der Verfügbarkeit von Anweisungsdatensätzen zwischen Lp und Multimodel festgestellt.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_524.wav", "doc_id": "dvGkKzmIaN.seg_524", "src_text": "We assume the provider can collect a general text corpus and count the word frequency with it.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir nehmen an, dass der Anbieter einen allgemeinen Textkörper sammeln kann und die Wortfrequenz mit ihm zählen kann.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_593.wav", "doc_id": "oeooqChmKK.seg_593", "src_text": "But natural language understanding often requires knowledge that is also supplied at inference time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Aber das Verständnis der natürlichen Sprache erfordert oft Wissen, das auch zur Verfügung gestellt", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_861.wav", "doc_id": "GvEBWkLmuI.seg_861", "src_text": "In our analysis, we reveal how these seemingly positive portrayals reflect harmful patterns.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "In unserer Analyse werden wir untersuchen, wie diese scheinbar positiven Darstellungen schädliche Muster widerspiegeln.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_240.wav", "doc_id": "oYCKgTzTDy.seg_240", "src_text": "So during training, we train it on English queries or the combination of English and German Few-shot queries to train a multilingual model to predict the SQL output.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "also trainieren wir während der Ausbildung auf englischen Anfragen oder der Kombination aus englischen und deutschen Few-Shot-Anfragen, um ein mehrsprachliches Modell zu trainieren.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_666.wav", "doc_id": "FLkGnzVRew.seg_666", "src_text": "We also check the feasibility of each strategy for annotation quality and costs to annotators.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir prüfen auch die Durchführbarkeit jeder Strategie für Annotationsklasse und -kosten und", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_488.wav", "doc_id": "SUkmfOTvGi.seg_488", "src_text": "This means that every unit of improvement that we made, on CoNLL-2003 translates to more than one unit improvement on CoNLL++ which means that there is no diminishing returns.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Das bedeutet, dass jede Verbesserung, die wir auf Color 2003 vorgenommen haben, sich auf mehr als eine Verbesserung auf Color Plus auswirkt, was bedeutet, dass es keine sinkenden Renditen gibt.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_465.wav", "doc_id": "SUkmfOTvGi.seg_465", "src_text": "Let's get started.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "2023? Lassen Sie", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_183.wav", "doc_id": "SLpqvupgvW.seg_183", "src_text": "The first speech bubble is chosen from a few manual prompts per domain.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Der erste Sprachbubble wird aus einigen manuellen Befehlen pro Domäne ausgewählt.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_262.wav", "doc_id": "oYCKgTzTDy.seg_262", "src_text": "Thanks for listening.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "fürs Zuhören.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_562.wav", "doc_id": "rISrKoXQCx.seg_562", "src_text": "So we could conduct a controlled experiment by further pretraining language model checkpoints on 6 different partisan corpora separated into news and social media, further divided into their political leaning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "werden. Daher können wir einen Kontrollversuch durchführen, indem wir mit weiteren Sprachmodellprüfungen auf sechs verschiedenen Parteien und Unternehmen, die in Nachrichten und sozialen Medien getrennt sind, beginnen. Wenn wir darüber hinaus Sprachmodelle vorbereiten und solche", "score": 56.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_784.wav", "doc_id": "WTTtiRKFZI.seg_784", "src_text": "Here loves to all conjuncts separately: Lisa, Bart, and Maggie.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Gouverneur, den sie lieben, zu allen Konjungengetrennt, das heißt, Lisa Barton macht es.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_769.wav", "doc_id": "XejEJmgUmE.seg_769", "src_text": "Please read our paper for more details of our experiments.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "nicht vollständig erfassen. Bitte lesen Sie unser Papier für weitere Details zu unseren Experimenten.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_147.wav", "doc_id": "wLqFAuDnKa.seg_147", "src_text": "The dev data is much more curated, and with higher quality than the training data, that it's more noisy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die Daten sind viel besser qualifiziert und mit einer höheren Qualität, dass die trainierten Daten, die ich sagen, und die Ergebnisse, bessere", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_409.wav", "doc_id": "WBLMIsdIrq.seg_409", "src_text": "We then look at vocabulary items that have high P-CXMI averaged over all of its different occurrences.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir schauen uns dann Vokabeln an, die durchschnittlich über alle Vorkommnisse hinweg überdurchschnittlich sind.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_613.wav", "doc_id": "oeooqChmKK.seg_613", "src_text": "Second, there's a \"Background-Both\" setting, where background knowledge is available both at pretrain time and inference time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zweitens, es gibt einen Backdoor-Setting. Hintergrundwissen ist sowohl vor als auch während der Ausbildung verfügbar.", "score": 35.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_697.wav", "doc_id": "oaOHnMCwad.seg_697", "src_text": "We then take the annotations by demographic and compare them to the models and datasets using a Pearson's R correlation score, and thus our framework actually differs from annotator disagreement literature by comparing end users with models and datasets, predictions and labels, as opposed to looking at just annotator agreement or modelling annotator distributions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Dann nehmen wir die Anmerkungen nach demografischen Kriterien und vergleichen sie mit den Modellen und Datensätzen, indem wir die Korrelationskoeffizienten verwenden. Und das unterscheidet sich von der Annotator-Disagreement-Literatur, bei der Endnutzer mit Modellen und Datensätzen verglichen werden, um nur eine Annotator-Disagreement-Verteilung oder Modellierung zu erhalten.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_803.wav", "doc_id": "WTTtiRKFZI.seg_803", "src_text": "So instead of 11, 6 is much shorter.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "sein, daher", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_231.wav", "doc_id": "oYCKgTzTDy.seg_231", "src_text": "And for example, we train the English model on English query and during inference we translate the German query using API to English and then use the trained model to predict the SQL.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zum Beispiel trainieren wir ein englisches Modell auf einer englischen Anfrage und übersetzen während der Inferenz die deutsche Anfrage mithilfe von API ins Englische und verwenden dann das trainierte Modell, um die Fortsetzung vorherzusagen.", "score": 62.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_551.wav", "doc_id": "rISrKoXQCx.seg_551", "src_text": "This has created a mixed blessing for language model applications.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Dies hat eine Mischung aus einer Sprachmodellanwendung erzeugt.", "score": 61.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_530.wav", "doc_id": "dvGkKzmIaN.seg_530", "src_text": "Copyright verification is to detect whether a model behind another service contains the word mark.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Ziel-Embedding. Die Überprüfung der Rechte ist dazu da, herauszufinden, ob ein Modell hinter einem anderen Dienst Inhalte enthält. Das Wasserzeichen.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_704.wav", "doc_id": "oaOHnMCwad.seg_704", "src_text": "We then replicate a very similar setup for the toxicity and hate speech detection task, where they'll read an instance from Dynahate and write whether they think it's instance of hate speech.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "dann eine sehr ähnliche Aufstellung für die Toxizitäts- und Sprachdetektionsaufgabe vornehmen, wobei wir Fälle aus den Bereichen Toxizität und Sprachdetektion berücksichtigen. Wir vergleichen dann diese Anmerkungen", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_755.wav", "doc_id": "XejEJmgUmE.seg_755", "src_text": "We increase the context length toward up to 1024 for to max out OPT and GPT 2 models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir haben die Kontextlänge auf 1024 erhöht, um die ODP- und GPT-2-Modelle zu maximieren, und", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_819.wav", "doc_id": "WTTtiRKFZI.seg_819", "src_text": "However, when the governor is on the right, as here, \"laughed\" governs the coordination Ted and Ned, this effect disappears.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "wird. Wenn jedoch die Regierung auf der rechten Seite, wie sie hier, links, die Koordination Tedenet regiert, tritt dieser Effekt auf. Daher zeigen", "score": 53.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_168.wav", "doc_id": "SLpqvupgvW.seg_168", "src_text": "This could happen when the user cannot remember the name of the song.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dies könnte passieren, wenn der Benutzer sich nicht an den Namen der Software erinnern kann.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_762.wav", "doc_id": "XejEJmgUmE.seg_762", "src_text": "So why does the match prefix affect the language model judgement so much?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Warum wirkt sich das Match-Präfix so stark auf das Sprachmodellurteil aus?", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_470.wav", "doc_id": "SUkmfOTvGi.seg_470", "src_text": "At the same time, if we do observe poor generalization, what causes the performance drop of these models?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Gleichzeitig, wenn wir eine schlechte Generalisierung beobachten, was verursacht den Leistungsabfall dieser Modelle?", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_480.wav", "doc_id": "SUkmfOTvGi.seg_480", "src_text": "The second ingredient is the model size.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Das zweite Bestandteil ist die Modellgröße.", "score": 76.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_872.wav", "doc_id": "GvEBWkLmuI.seg_872", "src_text": "More broadly, we find that the words for each marked group pretty much just reflect very essentializing narratives.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "dass die Wörter für jede markierte Gruppe sehr grundlegende Narrative widerspiegeln.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_109.wav", "doc_id": "uZBWfYjYnf.seg_109", "src_text": "If we go on and we receive another speech chunk, and our model predicts other three words and we will look at those cross-attention weights, we will see that no word points to the last lambda speech frames.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wenn wir fortfahren und einen anderen Sprachkanal erhalten und unser Modell andere drei Wörter vorhersagt, dann werden wir uns diese Cross-Attention-Werte anschauen. Wir werden feststellen, dass keine Wörter auf die letzten Lambeth-Sprachrahmen zeigen.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_511.wav", "doc_id": "dvGkKzmIaN.seg_511", "src_text": "The watermark method need to meet the following properties.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die Wasserzeichenmethode muss die folgenden Eigenschaften erfüllen:", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_533.wav", "doc_id": "dvGkKzmIaN.seg_533", "src_text": "Then the provider requests the embeddings from the stealer's service with the data set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Dann fordert der Anbieter von der ähnlichen Dienstleistung die Einbettungen mit dem Datensatz an.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_4.wav", "doc_id": "aQpIWggfCo.seg_4", "src_text": "And show that large language models can effectively decompose goals into steps.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und gezeigt, dass große Sprachmodelle Ziele effektiv in Schritte zerlegen können.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_856.wav", "doc_id": "GvEBWkLmuI.seg_856", "src_text": "However, when we actually look at the distribution of the words and lexicon, we find very different things.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wenn wir uns jedoch tatsächlich die Verteilung der Wörter im Lexikon ansehen, finden wir sehr unterschiedliche Dinge. Die generierten Personen", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_806.wav", "doc_id": "WTTtiRKFZI.seg_806", "src_text": "It violates one principle, but it satisfies another one.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "es ist ein Prinzip. Okay, also was haben wir", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_489.wav", "doc_id": "SUkmfOTvGi.seg_489", "src_text": "And this shows us that adaptive overfitting in this case is not observed.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und dies zeigt uns, dass in diesem Fall keine adaptiven Überlagerungen beobachtet werden.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_815.wav", "doc_id": "WTTtiRKFZI.seg_815", "src_text": "So the governor is on the left in this example \"I saw Bart and Lisa\" so is the governor is on the left.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Gouverneur in diesem Beispiel links – ich habe Bart und Lisa gesehen, also ist der Gouverneur links.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_246.wav", "doc_id": "oYCKgTzTDy.seg_246", "src_text": "We found that Encoder-Decoder or Encoder-PTR can be improved by training in a mixture of various languages.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir haben festgestellt, dass Encoder-Decoder oder Encoder-PDR durch Training in einer Mischung aus verschiedenen Sprachen verbessert werden können.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_797.wav", "doc_id": "WTTtiRKFZI.seg_797", "src_text": "It's okay the way instead of \"it\", we have this long NP.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "ist okay, wenn anstatt davon zu sprechen, dass ich gestern ein Buch", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_333.wav", "doc_id": "dJGfOSFgZO.seg_333", "src_text": "You can see that in the results of our experiment that several challenges still remain and have been precisely quantified.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "In den Ergebnissen unseres Experiments ist zu sehen, dass mehrere Herausforderungen noch bestehen und genau quantifiziert wurden.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_867.wav", "doc_id": "GvEBWkLmuI.seg_867", "src_text": "For Asian women, the words are things like \"petite\" and \"delicate\" and \"silky\" which connects to a long history of Asian women being hyper-sexualized, seen as very docile and submissive, and so on.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "asiatische Frauen sind die Worte wie \"petite\", \"delicate\" und \"silkie\". Dies verbindet sich mit einer langen Geschichte asiatischer Frauen, die als hypersexualisiert, sehr dös und submissive usw. gesehen", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_420.wav", "doc_id": "WBLMIsdIrq.seg_420", "src_text": "First of all, when we use corpus-level metrics: so for BLEU, we find that context-agnostic models have the best performance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wenn wir Korpus-Metriken verwenden, finden wir, dass komplexe Agnostiker", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_451.wav", "doc_id": "hgIDlKNiFM.seg_451", "src_text": "To evaluate our seven models, we gather data for public and private downstream tasks such as named entity recognition, classification, part-of-speech tagging, and question answering.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Um unsere sieben Modelle zu bewerten, haben wir zahlreiche öffentliche und private Donut-Tasks wie Name- und Autorekognition, Klassifizierung, Teilnehmer-Sprachaufnahme und Fragebeantwortung.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_834.wav", "doc_id": "GvEBWkLmuI.seg_834", "src_text": "To overcome these limitations, we rely on the property that these newer instruction-tuned LLMs are very good at responding to instructions and prompts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Um diese Einschränkungen zu überwinden, verlassen wir uns auf die Eigenschaft, dass diese neuen, anweisungs-justierten LLMs sehr gut auf Anweisungen und Impulse reagieren. Daher", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_472.wav", "doc_id": "SUkmfOTvGi.seg_472", "src_text": "This is a data set that we collected from Reuters News from 2020, and then annotated them with the same CoNLL-2003 annotation guidelines.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Plus“, der von Reuters News aus dem Jahr „2000“ stammt und dann mit den gleichen Anmerkungsrichtlinien von „Carno Plus“", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_189.wav", "doc_id": "SLpqvupgvW.seg_189", "src_text": "When we move higher in the list, the entities become more similar to each other and it's usually harder to make the disambiguation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "wir uns in der Liste nach oben bewegen, werden die Einheiten ähnlicher und es ist normalerweise schwieriger, die Gleichung zu lösen.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_814.wav", "doc_id": "WTTtiRKFZI.seg_814", "src_text": "Right?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "ist der", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_726.wav", "doc_id": "oaOHnMCwad.seg_726", "src_text": "But if you'd like to learn more, feel free to check out our dashboard for the most updated analysis results and our paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wenn Sie mehr sehen möchten, können Sie sich die neuesten Ergebnisse und Dokumente ansehen, vielen", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_680.wav", "doc_id": "oaOHnMCwad.seg_680", "src_text": "But that's not really the case for Aditya Sharma.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Aber das ist nicht wirklich der Fall für Aditya Sharma, wo", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_211.wav", "doc_id": "SLpqvupgvW.seg_211", "src_text": "If the language model has access only to entity names, then the accuracy is only 60%, so there's a lot of room for improvement.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wenn das Sprachmodell nur zu Entitätsnamen Zugriff hat, ist die Genauigkeit nur sechzig Prozent. Es gibt also viel Raum für Verbesserungen.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_858.wav", "doc_id": "GvEBWkLmuI.seg_858", "src_text": "So, really just only the positive or at least non-negative ones.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Also wirklich nur die positiven oder zumindest nicht negativen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_714.wav", "doc_id": "oaOHnMCwad.seg_714", "src_text": "However, when models and data sets are aligned to specific populations, some are inevitably left behind.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wenn Modelle und Datensätze jedoch spezifischen Populationen zugeordnet werden, sind einige unweigerlich zurückgeblieben.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_93.wav", "doc_id": "uZBWfYjYnf.seg_93", "src_text": "What is simultaneous speech translation?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Was ist Simultandolmetschen", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_329.wav", "doc_id": "dJGfOSFgZO.seg_329", "src_text": "Finally, we checked whether each evaluation metric captures a unique aspect of chat quality using a stepwise linear regression.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Schließlich überprüfen wir, ob die Bewertungsmetriken einen einzigartigen Aspekt der Qualität erfassen, indem wir eine lineare Regression verwenden. Sie können", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_365.wav", "doc_id": "gGbuDbHhyc.seg_365", "src_text": "But that's not the end of the story, because if we either way decide to access clean samples, then training on them directly will even achieve better performance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Aber das ist nicht das Ende der Geschichte, denn wenn wir uns auf saubere Proben entscheiden, wird das Training darauf direkt sogar noch bessere Leistungen erzielen.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_498.wav", "doc_id": "SUkmfOTvGi.seg_498", "src_text": "And lastly, please make sure to check out our paper, our data set and if you have any questions, feel free to contact me.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Und schließlich, bitte stellen Sie sicher, dass Sie unser Papier, unser Datensatz, überprüfen und wenn Sie Fragen haben, zögern Sie nicht, mich zu kontaktieren.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_212.wav", "doc_id": "SLpqvupgvW.seg_212", "src_text": "We've also shown that the models are domain-generalizable.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir haben auch gezeigt, dass die Modelle domänenübergreifend sind.", "score": 79.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_599.wav", "doc_id": "oeooqChmKK.seg_599", "src_text": "We evaluate the data set with human study participants and established coreference resolution models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hier ist", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_691.wav", "doc_id": "oaOHnMCwad.seg_691", "src_text": "So to study data set and model positionality, we actually compare the annotations with real users with existing datasets and models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Um Datensätze und Modellpositionen zu untersuchen, vergleichen wir die Anmerkungen mit realen Benutzern mit vorhandenen Datensätzen und Modellen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_291.wav", "doc_id": "PIZEXUFLAR.seg_291", "src_text": "We also introduce an additional evaluation metric called sensitivity.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir haben auch eine zusätzliche Bewertungsmaß, die Sensitivität, eingeführt, die", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_433.wav", "doc_id": "hgIDlKNiFM.seg_433", "src_text": "Then we will present the main contribution of our article.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "werden wir den Hauptbeitrag unseres Artikels präsentieren.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_431.wav", "doc_id": "hgIDlKNiFM.seg_431", "src_text": "Hi, I am Yanis Labrak and I will present you our works on \"DrBERT: A Robust Pre-trained Model in French for Biomedical and Clinical Domains.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hallo, ich bin Yanislavac und werde Ihnen unsere Arbeiten über Dr. Bert vorstellen, ein robustes britisches Modell in französischer Sprache für die biomedizinische und klinische Domäne.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_812.wav", "doc_id": "WTTtiRKFZI.seg_812", "src_text": "So the proportion is bigger of the left short conjunct.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "und daher ist der Anteil der linken, kürzeren Konjunkturen größer.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_429.wav", "doc_id": "WBLMIsdIrq.seg_429", "src_text": "Thank you so much for your attention.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Vielen Dank für die Rückmeldung.", "score": 21.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_243.wav", "doc_id": "oYCKgTzTDy.seg_243", "src_text": "And, we also evaluate Encoder-Decoder models, which is Multilingual Pretrained Encoder-Decoder Models, such as mBART and mT5.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir beurteilen auch Encoder-Decoder-Modelle, die multilinguale vorgeschulte Encoder-Decoder-Modelle sind, wie z.B. M-BART und M-T5.", "score": 78.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_61.wav", "doc_id": "TVCREhgqUP.seg_61", "src_text": "The trees are intended to capture the compositional process that relates utterances with the logical forms.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die Bäume sollen dazu dienen, den kompositionellen Prozess zu erfassen, der sich auf die logischen Formen bezieht.", "score": 61.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_552.wav", "doc_id": "rISrKoXQCx.seg_552", "src_text": "So on one hand, they were able to learn from diverse perspectives, which celebrates democracy and the plurality of ideas.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "auf der einen Seite von verschiedenen Perspektiven lernen, die Demokratie und die Vielfalt von Ideen feiern,", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_792.wav", "doc_id": "WTTtiRKFZI.seg_792", "src_text": "However, this effect may be ameliorated when the direct object is very heavy and very long.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Dieser Effekt kann jedoch verbessert werden, wenn das direkte Objekt sehr schwer und sehr lang ist,", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_56.wav", "doc_id": "TVCREhgqUP.seg_56", "src_text": "In contrast to standard machine learning evaluation, the test set does not come from the same distribution but contains structurally unseen logical forms.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Im Gegensatz zur standardisierten maschinellen Auswertung stammt der Test nicht aus derselben Verteilung, sondern enthält strukturell-logische Formen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_622.wav", "doc_id": "oeooqChmKK.seg_622", "src_text": "In this figure, we show the results of the best-performing models on the most difficult variant of the Background-Pretrain setting.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "In dieser Abbildung zeigen wir die Ergebnisse der leistungsfähigsten Modelle auf der schwierigsten Variante der Hintergrundvorbereitung. Beide", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_350.wav", "doc_id": "gGbuDbHhyc.seg_350", "src_text": "In recent works in WSL, so WSL stands for Weakly Supervised Learning, a common claim is that people say that they only train models on the weakly labeled data and achieve high performance on clean test sets.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "In jüngsten Arbeiten in der WS-Learning-Technologie bedeutet WS-Learning also „wöchentliches Selbstlernen“. Eine verbreitete Behauptung besagt, dass Menschen nur Modelle auf wöchentlichen Datenniveaus trainieren und auf sauberen Testsets hohe Leistungen erzielen.", "score": 61.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_422.wav", "doc_id": "WBLMIsdIrq.seg_422", "src_text": "And if we use word f-measure, then models with and without context have comparable performance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wenn wir Kommentar-Context-Modelle verwenden, dann haben Modelle mit oder ohne Kontext vergleichbare Leistungen.", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_587.wav", "doc_id": "rISrKoXQCx.seg_587", "src_text": "I think that's pretty much all I have for today.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "ich denke, das ist ziemlich viel, was ich bis heute habe, danke", "score": 56.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_878.wav", "doc_id": "GvEBWkLmuI.seg_878", "src_text": "Thank you so much for listening.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Vielen Dank für Ihre", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_40.wav", "doc_id": "aQpIWggfCo.seg_40", "src_text": "We find that T5 fine-tuned on CoScript can generate scripts of higher quality than most large language models, indicating that smaller models can surpass larger models when properly trained on suitable datasets.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Mit Fan Sites können T-File FanTune auf Courserat Skripte von höherer Qualität als die meisten Sprachmodelle generieren, was zeigt, dass kleinere Modelle größere Modelle unterstützen können, wenn sie auf geeigneten Datensätzen ordnungsgemäß trainiert werden.", "score": 69.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_116.wav", "doc_id": "uZBWfYjYnf.seg_116", "src_text": "These are all the results of the simultaneous speech translation strategy on German.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Diese sind alle Ergebnisse der simultanen Sprachübersetzung auf Deutsch.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_348.wav", "doc_id": "gGbuDbHhyc.seg_348", "src_text": "If we directly train neural networks on weakly labeled data, the neural networks tend to memorize the label noise and do not generalize.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wenn wir neuronale Netzwerke direkt trainieren und schwach beschriftete Daten verwenden, tendieren die neuronalen Netzwerke dazu, das Labelrauschen zu merken und nicht zu", "score": 61.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_264.wav", "doc_id": "PIZEXUFLAR.seg_264", "src_text": "So with the advances in large language models, many works started to explore new learning paradigms of reusing pre-trained language models for different downstream tasks in a parameter and data-efficient way.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "der Anpassung der Anpassung der Anpassung der Anpassung der Anpassung der Anpassung der Anpassung der Anpassung der Anpassung der Anpassung der Anpassung der Anpassung der Anpassung der Anpassung der Anpassung der Anpassung der Anpassung der Anpassung der Anpassung der Anpassung der Anpassung der Anpassung der Anpassung der Anpassung der Anpassung Daher begannen viele Arbeiten mit den Fortschritten bei großen Sprichmodellen, neue Lernparadigmen zu entdecken, indem sie trainierte Sprichmodelle für unterschiedliche Downstream-Aufgaben in einem Parameter und einer datenorientierten Art und Weise wieder zu verwenden.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_291.wav", "doc_id": "PIZEXUFLAR.seg_291", "src_text": "We also introduce an additional evaluation metric called sensitivity.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir haben auch eine zusätzliche Evaluationsmetrik namens Sensitivität eingeführt, die", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_656.wav", "doc_id": "FLkGnzVRew.seg_656", "src_text": "Further, on iteratively fine-tuning on both tasks, we find that fine-tuning of CE tasks followed by further fine-tuning on debate yields a much better zero-shot performance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "mit AuC-Punkt sechs. Weiterhin ist es beim Feintuning der beiden Aufgaben, dass wir durch weiteres Feintuning in der Debatte feststellen, dass die Leistung des Modells,", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_48.wav", "doc_id": "TVCREhgqUP.seg_48", "src_text": "My name is Matthias Lindemann, and today I'm going to give you a brief introduction to our paper on \"Compositional Generalization without Trees using Multiset Tagging and Latent Permutations\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "mein Name ist Mathias Lindemann, und heute werde ich Ihnen eine kurze Einführung in unser Papier über die Kompositionsgenerierung ohne Bäume geben, die mit Multisets und latenten Permutationen arbeiten.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_846.wav", "doc_id": "GvEBWkLmuI.seg_846", "src_text": "The second part is marked words, which is a method to identify the words that distinguish marked groups from unmarked ones, which I'll elaborate on shortly.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "zweite Teil ist „Marked Words“, eine Methode, um die Wörter zu identifizieren, die Markgruppen von Markgruppen unterscheiden.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_207.wav", "doc_id": "SLpqvupgvW.seg_207", "src_text": "If the language model has access to the exact same background knowledge as the annotators, then the accuracy is really high, it's around 92 to 95%.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wenn das Sprachmodell auf die genau gleiche Hintergrundkenntnis wie die Annotatoren zugreifen kann, ist die Genauigkeit wirklich hoch, sie liegt bei etwa neunundneunzig Prozent,", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_878.wav", "doc_id": "GvEBWkLmuI.seg_878", "src_text": "Thank you so much for listening.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Vielen Dank für Ihre", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_37.wav", "doc_id": "aQpIWggfCo.seg_37", "src_text": "This figure shows the constraint distribution of CoScript.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Diese Abbildung zeigt eine konstrizierte Verteilung von CoScript;", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_853.wav", "doc_id": "GvEBWkLmuI.seg_853", "src_text": "So for instance, for the personas of black women, we would do Fightin’ Words and compare the log-odds ratios against both white personas and man personas because those are the two corresponding unmarked groups.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "So zum Beispiel würden wir für die Persönlichkeiten schwarzer Frauen Wörter kämpfen und die Logos-Raten gegenüber sowohl weißer Persönlichkeit als auch männlicher Persönlichkeit vergleichen, weil es sich dabei um zwei korrespondierende unmarkierte Gruppen handelt.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_736.wav", "doc_id": "XejEJmgUmE.seg_736", "src_text": "And then the hope is that the model, basically, puts more probability to the acceptable sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hoffnung, dass das Modell grundlegend mehr Wahrscheinlichkeit auf den akzeptablen Bereich legt.", "score": 88.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_730.wav", "doc_id": "XejEJmgUmE.seg_730", "src_text": "Language model acceptability judgments are not always robust to context.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Acceptability Judgments are not always robust to context, begrüßen zu", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_741.wav", "doc_id": "XejEJmgUmE.seg_741", "src_text": "So that is the approach.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Also ist das der Ansatz,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_236.wav", "doc_id": "oYCKgTzTDy.seg_236", "src_text": "For example, we put the German, English, Chinese queries together to train a multilingual model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Zum Beispiel setzen wir die deutschen, englischen und chinesischen Suchbegriffe zusammen, um ein mehrsprachiges Modell zu trainieren,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_138.wav", "doc_id": "wLqFAuDnKa.seg_138", "src_text": "In our experiments, we settled for a 5-shot prompting strategy where we just marked each sentence that we provide to the system, with the language it's in.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "In unseren Experimenten haben wir uns für eine fünf-Schuss-Strategie entschieden, bei der wir einfach die Sätze, die wir dem System bieten, mit der Sprache markieren, in der sie stehen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_867.wav", "doc_id": "GvEBWkLmuI.seg_867", "src_text": "For Asian women, the words are things like \"petite\" and \"delicate\" and \"silky\" which connects to a long history of Asian women being hyper-sexualized, seen as very docile and submissive, and so on.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "für asiatische Frauen Wörter wie „petit“ und „delicat“ und „silky“. was auf eine lange Geschichte von asiatischen Frauen zurückzuführen ist, die hypersexuell sind, sehr nachgiebig und unterwürfig sind. Und", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_65.wav", "doc_id": "TVCREhgqUP.seg_65", "src_text": "Obtaining trees may also involve specialized grammar-induction procedures.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Die Gewinnung von Bäumen kann auch spezialisierte Grammatikinduktionsverfahren umfassen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_787.wav", "doc_id": "WTTtiRKFZI.seg_787", "src_text": "The argument is based on the principle of dependency length minimization that I will explain on the basis of these examples.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "das Argument basiert auf dem Prinzip der Abhängigkeitslängenminimierung, das wir anhand dieser Beispiele erläutern. Also", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_78.wav", "doc_id": "TVCREhgqUP.seg_78", "src_text": "We determine the third token in the output in a similar way by jumping to another multiset token.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir bestimmen den dritten Token in der Ausgabe auf ähnliche Weise, indem wir zu einem anderen Multiset-Token springen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_128.wav", "doc_id": "wLqFAuDnKa.seg_128", "src_text": "We evaluated the transition capability of such models using the best practices of the MT community.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir bewerten die Übersetzungsvermögen solcher Modelle, indem wir die besten Praktiken der MT-Gemeinschaft verwenden.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_584.wav", "doc_id": "rISrKoXQCx.seg_584", "src_text": "And it's incredibly hard to determine what is actually neutral and should be retaining language monitoring data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "ist unglaublich schwierig, zu bestimmen, was tatsächlich neutral ist und in Sprachdaten behalten werden sollte, also", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_563.wav", "doc_id": "rISrKoXQCx.seg_563", "src_text": "By further pretraining language models on such partisan corpora we can see that the ideological coordinates of the language model also correspondingly shift.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Durch weiteres Üben von Sprachmodellen und Korpora können wir sehen, dass sich auch die ideologische Verbindung des Sprachmodells ändert. Beispielsweise", "score": 59.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_24.wav", "doc_id": "aQpIWggfCo.seg_24", "src_text": "Next, a filter model is developed to select the faithful scripts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Als nächstes wird ein Filtermodell entwickelt, um die physischen Skripte auszuwählen.", "score": 31.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_809.wav", "doc_id": "WTTtiRKFZI.seg_809", "src_text": "So, \"salt and pepper\" and not \"pepper and salt\", measured in syllables.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "also Salz und Pfeffer, nicht Pfeffer und Salz, und auch. Die", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_656.wav", "doc_id": "FLkGnzVRew.seg_656", "src_text": "Further, on iteratively fine-tuning on both tasks, we find that fine-tuning of CE tasks followed by further fine-tuning on debate yields a much better zero-shot performance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Six. Bei weiteren iterativen Feinabstimmungen der beiden Aufgaben finden wir die Feinabstimmung der CE-Aufgaben durch weitere Feinabstimmung auf Debette.", "score": 52.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_388.wav", "doc_id": "WBLMIsdIrq.seg_388", "src_text": "Well, if the previous sentence was \"Things could start to get dangerous if the ministers find out\", then \"mole\" refers to a spy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wenn der vorherige Satz besagte, dass die Dinge gefährlich werden könnten, wenn die Minister es herausfinden würden, dann bezieht sich Mo auf einen Spion,", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_152.wav", "doc_id": "wLqFAuDnKa.seg_152", "src_text": "The insights that we gained from the human evaluation that we performed using the MQM framework said that the fluency of PaLM is comparable to state-of-the-art systems but the main difference comes from the accuracy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die Erkenntnisse, die wir aus der menschlichen Analyse gewinnen, die wir mit dem MQM-Framework durchführen, sind, dass die Flüssigkeit von Palm vergleichbar mit dem aktuellen Stand der Systeme ist, aber der Hauptunterschied kommt von der Genauigkeit. Besonders", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_583.wav", "doc_id": "rISrKoXQCx.seg_583", "src_text": "If we do try to sanitaze somehow, we would also risk censorship, or exclusion.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wenn wir versuchen, die Sensitivität zu senken, riskieren wir auch eine Zensur oder Ausklammerung, und", "score": 58.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_847.wav", "doc_id": "GvEBWkLmuI.seg_847", "src_text": "The benefit of this is that we get really specific stereotypes and patterns, without having to rely on any specific lexicon.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Der Vorteil dabei ist, dass wir wirklich spezifische Stereotypen und Muster erkennen können, ohne auf irgendein spezifisches Lexikon angewiesen zu sein.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_729.wav", "doc_id": "XejEJmgUmE.seg_729", "src_text": "I'm Koustav Sinha, and I'm pleased to welcome you to our talk of our ACL 2023 paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "ich bin Kostas Sena und freue mich, Sie zu unserem Gespräch über die Sprachmodellakzeptabilitätsurteile in unserer Arbeit zu „ACL", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_17.wav", "doc_id": "aQpIWggfCo.seg_17", "src_text": "Results in the figure show that the semantic completeness in generated scripts is acceptable but the faithfulness to the constraints cannot be guaranteed.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "in den Tabellen zeigen, dass die semantische Vollständigkeit in generierten Skripten akzeptabel ist, aber die Treue zu den Einschränkungen nicht garantiert werden kann.", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_554.wav", "doc_id": "rISrKoXQCx.seg_554", "src_text": "To this end, we propose to investigate the political bias propagation pipeline from pretraining data to language models to downstream tasks, specifically by asking the following questions: First, how do we evaluate the political leaning of language models and what role does pretraining data might have on such political biases?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zu diesem Zweck schlagen wir vor, die Pipeline der politischen Propaganda von der Datenerhebung über die Sprachmodelle bis hin zu den Aufgaben zu untersuchen, insbesondere indem wir die folgenden Fragen stellen. Erstens, wie bewerten wir die politische Ausrichtung der Sprachmodelle und welche Rolle spielen die Pronomen in diesem Zusammenhang? Zweitens, wie", "score": 72.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_121.wav", "doc_id": "uZBWfYjYnf.seg_121", "src_text": "Thanks for your attention.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vielen Dank für Ihre Aufmerksamkeit.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_469.wav", "doc_id": "SUkmfOTvGi.seg_469", "src_text": "And when we develop new taggers, what is needed for good generalization?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Und wenn wir neue Tags entwickeln, was ist für eine gute Verallgemeinerung erforderlich?", "score": 57.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_198.wav", "doc_id": "SLpqvupgvW.seg_198", "src_text": "Here's for example, the Google search result for the song \"Easy on Me.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hier ist zum Beispiel das Google-Suchergebnis für das Lied \"Easy on Me\".", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_746.wav", "doc_id": "XejEJmgUmE.seg_746", "src_text": "So we can do the same thing by choosing unacceptable sentences from the same matching, and that could also be used to test the models acceptability.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir können also dasselbe tun, indem wir unakzeptable Sätze aus der gleichen Übereinstimmung auswählen, und das könnte auch verwendet werden, um die Akzeptabilität der Modelle zu testen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_828.wav", "doc_id": "GvEBWkLmuI.seg_828", "src_text": "Hi, I'm Myra and today I'll be talking about our paper \"Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "„Und heute sprechen wir über die Marktpersönlichkeiten, die sich auf natürliche Sprachmuster beziehen, in Sprachmodellen. Diese Arbeit", "score": 35.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_159.wav", "doc_id": "SLpqvupgvW.seg_159", "src_text": "Hi!", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_813.wav", "doc_id": "WTTtiRKFZI.seg_813", "src_text": "But what's novel in this paper is that we observed that this tendency only occurs when the governor is on the left or absent.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "diesem Dokument neu ist, ist, dass wir festgestellt haben, dass diese Tendenz nur auftritt, wenn der Gouverneur abwesend ist. In diesem Beispiel ist der", "score": 57.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_248.wav", "doc_id": "oYCKgTzTDy.seg_248", "src_text": "I think this is known as the \"Curse of Multilinguality\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Ich glaube, das ist als Fluch der Mehrsprachigkeit bekannt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_215.wav", "doc_id": "oYCKgTzTDy.seg_215", "src_text": "Hello everyone, my name is Yusen Zhang from the Penn State University.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hallo alle, mein Name ist Usman John von der Pennsylvania University.", "score": 9.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_450.wav", "doc_id": "hgIDlKNiFM.seg_450", "src_text": "In total, we have seven models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Insgesamt haben wir sieben Modelle.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_362.wav", "doc_id": "gGbuDbHhyc.seg_362", "src_text": "This indicates that WSL approaches actually require cleanly labeled data to work properly, and the annotation cost for obtaining clean validation samples should not be overlooked.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Dies zeigt, dass WS-L-Verfahren tatsächlich sauber gekennzeichnete Daten erfordern, um richtig zu funktionieren, und dass die Annotierungskosten für die Beschaffung sauberer Validierungsmuster nicht unberücksichtigt werden sollten.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_44.wav", "doc_id": "aQpIWggfCo.seg_44", "src_text": "We hope the CoScript dataset can be a valuable resource to advance research on language planning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir hoffen, dass dieser Datensatz eine wertvolle Ressource für die Sprachplanung sein kann.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_397.wav", "doc_id": "WBLMIsdIrq.seg_397", "src_text": "To answer the first question, we started by measuring how much a word depends on context during translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Um die erste Frage zu beantworten, beginnen wir damit, wie viel ein Wort von der Übersetzung abhängt.", "score": 45.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_36.wav", "doc_id": "aQpIWggfCo.seg_36", "src_text": "To ensure the quality of the validation and test set, we ask crowd-sourced workers to find and revise the incorrect samples.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "um die Güte der Validierung und Teststellen zu gewährleisten, und bitten Crowdsource-Arbeiter, die inkorrekten Proben zu überprüfen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_808.wav", "doc_id": "WTTtiRKFZI.seg_808", "src_text": "So what we did, we extracted various statistics about coordination from the enhanced version of the Penn Treebank and see the paper \"Why wouldn't you use universal dependencies\" and these statistics confirm the observation made many times before that left conjuncts tend to be shorter.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "haben wir aus der erweiterten Version der Pensionsbank und dem Papier, warum wir keine universellen Abhängigkeiten verwenden. Und diese Statistiken bestätigen die Beobachtung, die viele Male zuvor gemacht wurde,", "score": 15.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_45.wav", "doc_id": "aQpIWggfCo.seg_45", "src_text": "Thanks for your time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Vielen Dank für Ihre Zeit.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_559.wav", "doc_id": "rISrKoXQCx.seg_559", "src_text": "They occupy all four quadrants on the political campus.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "sie alle vier Quadranten des politischen Komplexes besetzen.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_81.wav", "doc_id": "TVCREhgqUP.seg_81", "src_text": "Our model outperforms the others by a large margin on generalization to deeper recursion.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Kog's-Benchmark. Unsere Methode übertrifft die anderen um einen großen Vorsprung bei der Verallgemeinerung und", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_289.wav", "doc_id": "PIZEXUFLAR.seg_289", "src_text": "If the task is a multi-model classification task, we report accuracy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wenn es sich bei der Aufgabe um eine multimodale Klassifizierungsaufgabe handelt, geben wir die Genauigkeit an,", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_205.wav", "doc_id": "SLpqvupgvW.seg_205", "src_text": "The AltEntities Corpus has 6,000 alternative questions across three domains, and it has 42,000 indirect referring expressions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Das Korpus der Identitätsalternative enthält 6.000 alternative Fragen in drei Domänen und 42.000 indirekte Referenzausdrücke.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_566.wav", "doc_id": "rISrKoXQCx.seg_566", "src_text": "So we divide pretraining corpora, into pre 45th president of the United States and after 45th president of the United States.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "teilen wir die Vorbereitung von Korpora in zwei Korpora vor und nach 1955 in die Vereinigten Staaten", "score": 25.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_468.wav", "doc_id": "SUkmfOTvGi.seg_468", "src_text": "Firstly, can these models generalise to modern data?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Erstens: Können diese Modelle auf moderne Daten verallgemeinert werden?", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_496.wav", "doc_id": "SUkmfOTvGi.seg_496", "src_text": "And we found that the answer is actually a resounding yes.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "wir fanden heraus, dass die Antwort tatsächlich eindeutig lautet: \"Ja\".", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_369.wav", "doc_id": "gGbuDbHhyc.seg_369", "src_text": "As we can see from the figures, the vanilla model, termed FTw, initially underperforms more complicated WSL methods, like COSINE.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wie aus den Zahlen hervorgeht, unterperformt das Vallina-Modell, das Ftw genannt wird, zunächst komplexere WSL-Methoden wie Kosine.", "score": 63.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_767.wav", "doc_id": "XejEJmgUmE.seg_767", "src_text": "So, the key takeaways of our work is that language models are sensitive to latent syntactic and semantic features which are shared across the sentences.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "sind die Schlüsselmerkmale unserer Arbeit, dass Sprachmodelle gegenüber latenten syntaktischen und semantischen Merkmalen, die über die Sätze hinweg geteilt werden, empfindlich", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_9.wav", "doc_id": "aQpIWggfCo.seg_9", "src_text": "A good planner should write scripts that are reasonable and faithful to constraints.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Ein guter Planer sollte Skripte schreiben, die vernünftig und konsequent sind.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_393.wav", "doc_id": "WBLMIsdIrq.seg_393", "src_text": "And some people have suggested targeted evaluation on context-dependent translations, but these resources only support limited types of context-dependent translations and limited sets of languages since they usually rely on domain knowledge and human curation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und einige Leute haben eine gezielte Auswertung von kontextsensitiven Übersetzungen vorgeschlagen, aber diese Ressourcen unterstützen nur begrenzte Arten von kontextsensitiven Übersetzungen und begrenzte Sätze von Sprachen, da sie sich normalerweise auf die Domäne des Wissens und der menschlichen Kreativität", "score": 79.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_649.wav", "doc_id": "FLkGnzVRew.seg_649", "src_text": "On collecting around 1,000 examples of discourse unit pairs, we ran training for an initial classifier trained only on 43 examples of dissonance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "finden. Bei der Sammlung von Tausenden von Beispielen von Diskurs-Einheiten trainieren wir für einen anfänglichen Klassifizierer und trainieren nur an vierunddreißig Beispielen", "score": 35.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_296.wav", "doc_id": "PIZEXUFLAR.seg_296", "src_text": "Here we can see, as the amount of task increases, the model achieves better performance and in the meantime, lower sensitivity.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hier sehen wir, wie das Modell mit zunehmender Aufgabenmenge eine bessere Leistung und eine geringere Sensibilität erreicht.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_678.wav", "doc_id": "oaOHnMCwad.seg_678", "src_text": "You might turn towards a popular API like Prospective API for toxicity detection, and this works really well if you're Carl Jones.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Sie könnten sich einer beliebten API zuwenden, wie z.B. eine API für die Erkennung von Toxizität, und das funktioniert wirklich gut, wenn Sie Carl Jones sind,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_151.wav", "doc_id": "wLqFAuDnKa.seg_151", "src_text": "In our case, we chose to evaluate with Google Translate.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "an einem kommerziellen System. Wir haben uns entschieden, Google Translate zu verwenden.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_235.wav", "doc_id": "oYCKgTzTDy.seg_235", "src_text": "And we test Multilingual Model which we train one multilingual model for all languages.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Und es hat ein multilinguales Modell, das wir für alle Sprachen trainieren.", "score": 81.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_12.wav", "doc_id": "aQpIWggfCo.seg_12", "src_text": "As shown in the table, we extend the abstract goals with multi-faceted constraints for human-in-the-loop data acquisition using InstructGPT.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wie in der Tabelle gezeigt, erweitern wir die abstrakten Ziele mit mehrstufigen Einschränkungen, damit Menschen, die die Datenerfassung durch die Lupe verwenden, die Inst. gpt. verwenden.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_295.wav", "doc_id": "PIZEXUFLAR.seg_295", "src_text": "Also, transfer learning from natural instruction dataset can benefit instruction tuning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Übertragen des Lernens von natürlichen Instruktionsdatensätzen kann das Abstimmen von Anweisungen verbessern.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_710.wav", "doc_id": "oaOHnMCwad.seg_710", "src_text": "So for the GPT 4 social acceptability analysis, we find that it's most aligned to confucian and English speaking countries.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "die Position handelt. Beispielsweise finden wir, dass es sich bei den Datensätzen um die meisten englischsprachigen Länder", "score": 35.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_355.wav", "doc_id": "gGbuDbHhyc.seg_355", "src_text": "First, is clean validation data necessary for WSL or can we maybe use a noisy validation set instead?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Erstens: Sind saubere Validierungsdaten für WSL erforderlich? Oder können wir stattdessen einen lauten Validierungssatz verwenden? Zweitens, wenn", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_417.wav", "doc_id": "WBLMIsdIrq.seg_417", "src_text": "We can then also note that different languages have different proportions of these discourse phenomena.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir können dann auch feststellen, dass verschiedene Sprachen unterschiedliche Aspekte dieser Diskursphänomene haben.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_770.wav", "doc_id": "XejEJmgUmE.seg_770", "src_text": "Thank you for listening.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Experimenten.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_498.wav", "doc_id": "SUkmfOTvGi.seg_498", "src_text": "And lastly, please make sure to check out our paper, our data set and if you have any questions, feel free to contact me.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und zuletzt, bitte stellen Sie sicher, dass Sie unser Papier und unseren Datensatz überprüfen, und wenn Sie Fragen haben, zögern Sie nicht, mich zu kontaktieren.", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_810.wav", "doc_id": "WTTtiRKFZI.seg_810", "src_text": "And, also the observation that was made in parsing that this tendency grows with length difference.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und außerdem die Beobachtung, dass sich eine Tendenz mit Längenunterschieden entwickelt. Also", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_286.wav", "doc_id": "PIZEXUFLAR.seg_286", "src_text": "Each instance is randomly combined with one of its five instruction templates.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Jede Instanz wird zufällig mit einem der fünf Anweisungstemplate kombiniert.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_779.wav", "doc_id": "WTTtiRKFZI.seg_779", "src_text": "Now those are asymmetric approaches to coordinate structures, such as the Prague approach.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "jetzt setzen sie auch symmetrische Ansätze für koordinierte Strukturen wie den pragmatischen Ansatz,", "score": 15.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_184.wav", "doc_id": "SLpqvupgvW.seg_184", "src_text": "The second one, which is the alternative question is generated as follows.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Die zweite, die alternative Frage, wird wie folgt generiert:", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_284.wav", "doc_id": "PIZEXUFLAR.seg_284", "src_text": "So we use pre-trained OFA large model as a base model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir verwenden also das vorgefertigte OAF-Large-Modell als Basismodell.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_247.wav", "doc_id": "oYCKgTzTDy.seg_247", "src_text": "We found it is because most of the major natural languages can obtain performance gain, except that English performance drops in seven datasets and only gains in three datasets.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und es wurde festgestellt, dass die meisten natürlichen Sprachen Leistungssteigerungen erzielen können, außer dass die Leistung von Englisch in sieben Datensätzen sinkt und nur in drei Datensätzen steigt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_323.wav", "doc_id": "dJGfOSFgZO.seg_323", "src_text": "To determine what kind of evaluation is most effective, we selected four state-of-the-art chat models and evaluated them on 100 human-bot conversations per model using ABC-Eval.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Um zu bestimmen, welche Art der Bewertung am effektivsten ist, haben wir vier Chat-Modelle ausgewählt und sie an einhundert menschlichen Gesprächen pro Modell mit A.B.E. ausgewertet.", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_658.wav", "doc_id": "FLkGnzVRew.seg_658", "src_text": "Next, we determine the best method to update a model with new data from each round of active learning and annotations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Als nächstes werden wir die beste Methode bestimmen, um ein Modell mit neuen Daten aus jeder Runde des aktiven Lernens und der Anmerkungen zu aktualisieren.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_586.wav", "doc_id": "rISrKoXQCx.seg_586", "src_text": "Ok, great.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Ok, großartig,", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_655.wav", "doc_id": "FLkGnzVRew.seg_655", "src_text": "We find that on transferring the zero-shot performance on the annotated data set is already much better than chance with the best, with AUC .62.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir stellen fest, dass die Übertragung der Null-Shot-Performance des annotierten Datensatzes schon viel besser ist als die Chance mit dem besten UC Point", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_141.wav", "doc_id": "wLqFAuDnKa.seg_141", "src_text": "It's crucial for zero and one-shot prompting.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Es ist für Null und eins Prompts,", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_108.wav", "doc_id": "uZBWfYjYnf.seg_108", "src_text": "This means that the first two words will be emitted while since the sum of the cross-attention is above a certain threshold alpha, we will not emit the last word and we wait for another speech chunk.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Das bedeutet, dass die ersten beiden Wörter ausgeklammert werden. Während die Summe der Kreuzverhöre über einem bestimmten Grenzwert liegt, werden wir das letzte Wort nicht verlieren und auf einen anderen Redner warten.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_283.wav", "doc_id": "PIZEXUFLAR.seg_283", "src_text": "In addition, we randomly sample 20 tasks from the test split of natural instructions as an unseen task for NLP.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "jede Aufgabe. Zufällig wurden 1000 Aufgaben aus der Testmenge von Natural Instruction als \"unbekannte\" Aufgaben für NLP ausgewählt.", "score": 57.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_671.wav", "doc_id": "FLkGnzVRew.seg_671", "src_text": "These are the links to our code, data set and our paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Diese sind die Links zu Ihrem Code-Datensatz und Ihrem Papier.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_709.wav", "doc_id": "oaOHnMCwad.seg_709", "src_text": "For example, we find that data sets and models are most aligned to English speaking countries.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "stellen wir fest, dass die Datenmodelle am meisten mit englischsprachigen Ländern übereinstimmen, so", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_171.wav", "doc_id": "SLpqvupgvW.seg_171", "src_text": "Here are some examples of indirect references for example, \"the newer one\" or \"the song that's not energetic.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "gibt es einige Beispiele für direkte Referenzen, zum Beispiel die neuere oder die nicht energiereiche Melodie.", "score": 19.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_110.wav", "doc_id": "uZBWfYjYnf.seg_110", "src_text": "This means that these three words will be emitted.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Das bedeutet, dass diese drei Wörter ausgelassen werden.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_851.wav", "doc_id": "GvEBWkLmuI.seg_851", "src_text": "And more broadly, dominant groups in society are both linguistically and socially unmarked, while the marginalized groups are usually marked.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Im Allgemeinen sind dominante Gruppen in der Gesellschaft sowohl sprachlich als auch sozial unmarkiert, während die marginalisierten Gruppen normalerweise markiert sind.", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_568.wav", "doc_id": "rISrKoXQCx.seg_568", "src_text": "We can see that language models generally had a political leaning that is further away from the centre after 2017.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir können die Sprachmodelle im Allgemeinen sehen, die nach dem Zentrum weggehen, nachdem sie 27 Jahre lang", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_344.wav", "doc_id": "gGbuDbHhyc.seg_344", "src_text": "I'd like to begin with a brief introduction to weak supervision and weakly supervised learning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Ich möchte mit einer kurzen Einführung in die Woche-Überwachung und die wöchentlich überwachte Lernmethode beginnen.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_581.wav", "doc_id": "rISrKoXQCx.seg_581", "src_text": "It's like between Scylla and Charybdis.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "B. zwischen Sila und Kribidis.", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_316.wav", "doc_id": "dJGfOSFgZO.seg_316", "src_text": "One approach is to simply ask human judges to evaluate several dimensions of dialogue quality, such as the relevance of model responses using existing comparative or Likert scale methods.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Eine Vorgehensweise besteht darin, Menschenrichter einfach zu bitten, mehrere Dimensionen der Dialogqualität zu bewerten, wie z. B. die Relevanz von Modellantworten, mit existierenden vergleichenden oder Likert-Skalen.", "score": 48.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_481.wav", "doc_id": "SUkmfOTvGi.seg_481", "src_text": "We found that usually larger models lead to better generalization.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir stellten fest, dass größere Modelle zu einer besseren Generalisierung führen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_90.wav", "doc_id": "TVCREhgqUP.seg_90", "src_text": "We approximate this with a GPU-friendly continuous relaxation that also allows us to backpropagate through the solution and learn the linguistically more plausible permutations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir nähern uns diesem mit einer GPU-freundlichen, kontinuierlichen Relaxation, die es uns auch ermöglicht, durch die Lösung zurückzugehen und die linguistisch plausibleren Permutationen zu lernen.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_512.wav", "doc_id": "dvGkKzmIaN.seg_512", "src_text": "First the method should be applicable to embedding as services.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zuerst die Methode sollte anwendbar sein auf Einbettungen und Services:", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_861.wav", "doc_id": "GvEBWkLmuI.seg_861", "src_text": "In our analysis, we reveal how these seemingly positive portrayals reflect harmful patterns.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In unserer Analyse zeigen wir, wie diese scheinbar positiven Porträts schädliche Muster widerspiegeln.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_819.wav", "doc_id": "WTTtiRKFZI.seg_819", "src_text": "However, when the governor is on the right, as here, \"laughed\" governs the coordination Ted and Ned, this effect disappears.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "auf der rechten Seite ist, wie hier links, dann wird die Koordination netterweise übernommen. Durch die Messung der", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_663.wav", "doc_id": "FLkGnzVRew.seg_663", "src_text": "We find that the proposed PRC strategy works better than other state-of-the-art strategies, although the difference is small.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir stellen fest, dass die vorgeschlagene PRC-Strategie besser funktioniert als andere Strategien, obwohl die Unterschiede gering sind,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_310.wav", "doc_id": "dJGfOSFgZO.seg_310", "src_text": "And today we'll tell you all about ABC-Eval, a new dimensional approach to evaluating conversational AI.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und heute werden wir Ihnen alles über AVECVAL erzählen, eine neue dimensionale Herangehensweise zur Bewertung von konversationaler KI.", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_827.wav", "doc_id": "WTTtiRKFZI.seg_827", "src_text": "Thank you.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "über.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_569.wav", "doc_id": "rISrKoXQCx.seg_569", "src_text": "So this indicates that language models can also pick up the polarisation in our society.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "die politische Linie gehalten haben, so dass dies anzeigt, dass Sprachmodelle auch die Art Polarisierung in unserer Gesellschaft annehmen können.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_291.wav", "doc_id": "PIZEXUFLAR.seg_291", "src_text": "We also introduce an additional evaluation metric called sensitivity.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir haben auch eine zusätzliche Bewertungsmetrik namens Sensibilität eingeführt, die", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_844.wav", "doc_id": "GvEBWkLmuI.seg_844", "src_text": "Our prompts to generate these personas were inspired by a study where they gave these prompts to human subjects, finding that by giving it to human subjects, they also were able to surface racial stereotypes.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Unsere Prompts, um diese Personas zu generieren, wurden von einer Studie inspiriert, in der sie diese Prompts für menschliche Subjekte gaben und feststellten, dass sie auch in der Lage waren, rassistische Stereotypen hervorzurufen. Und diese", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_659.wav", "doc_id": "FLkGnzVRew.seg_659", "src_text": "\"Cumulative\" accumulates all the data collected from active annotation so far, whereas \"Iterative\" updates the model by training on the latest set of data collected.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Cumulative akkumuliert alle bislang gesammelten Daten aus aktiven Annotationen, während iterativ das Modell durch Training auf dem letzten Satz gesammelter Daten aktualisiert.", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_735.wav", "doc_id": "XejEJmgUmE.seg_735", "src_text": "And in this, minimal pair paradigm, the typical way to evaluate language models is that you show like an acceptable sentence or a grammatical sentence and then you show an unacceptable sentence or an ungrammatical sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "umfassen können. In diesem Minimalpaar-Paradigma ist die typische Methode zur Bewertung von Sprachmodellen, dass Sie zeigen, Beispielsweise eine akzeptable Satz oder ein grammatikalischer Satz, und dann zeigen Sie einen unakzeptablen Satz oder einen ungrammatischen Satz an, und", "score": 61.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_264.wav", "doc_id": "PIZEXUFLAR.seg_264", "src_text": "So with the advances in large language models, many works started to explore new learning paradigms of reusing pre-trained language models for different downstream tasks in a parameter and data-efficient way.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Mit den Fortschritten in großen Sprachmodellen begannen viele Arbeiten, neue Lernparadigmen für die Wiederverwendung von trainierten Sprachmodellen für unterschiedliche Aufgaben in einem Parameter und in einer datenintensiven Weise zu", "score": 46.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_43.wav", "doc_id": "aQpIWggfCo.seg_43", "src_text": "We use large language models to generate a high-quality script dataset, CoScript, for constrained language planning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir verwenden große Sprachmodelle, um ein hochwertiges Datensatz für die Sprachplanung zu generieren, den „CoScript“.", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_169.wav", "doc_id": "SLpqvupgvW.seg_169", "src_text": "Or the pronunciations are too similar to each other and hard to disambiguate.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "weiß. Alle Aussprachen sind zu ähnlich und schwer zu unterscheiden.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_458.wav", "doc_id": "hgIDlKNiFM.seg_458", "src_text": "Which is not the case for the model based on CamemBERT weights and tokenizer, which suffer from stability issues.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Dies ist nicht der Fall für das Modell auf Camembert White und Tokanether, die an Stabilitätsprobleme leiden. Schließlich, als", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_677.wav", "doc_id": "oaOHnMCwad.seg_677", "src_text": "So let's start off by imagining that you're working for a newspaper and you're sifting through comments under your news article trying to remove toxic content.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Beginnen wir also damit, uns vorzustellen, dass Sie für eine Zeitung arbeiten und Kommentare zu Ihrem Artikel schreiben, um toxischen Inhalt zu entfernen.", "score": 61.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_454.wav", "doc_id": "hgIDlKNiFM.seg_454", "src_text": "However, we can observe that data from heterogeneous sources appear to be more versatile.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir können jedoch erkennen, dass Daten aus heterogenen Quellen wahrscheinlich mehr Verschmelzungsfähigkeit aufweisen,", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_566.wav", "doc_id": "rISrKoXQCx.seg_566", "src_text": "So we divide pretraining corpora, into pre 45th president of the United States and after 45th president of the United States.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "die Korpora vor dem 45. Präsidenten der Vereinigten Staaten und nach dem 45. Präsidenten der Vereinigten Staaten", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_236.wav", "doc_id": "oYCKgTzTDy.seg_236", "src_text": "For example, we put the German, English, Chinese queries together to train a multilingual model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "z.B. indem wir die deutschen, englischen und chinesischen Fragen zusammenfassen, um ein multilinguales Modell zu trainieren,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_611.wav", "doc_id": "oeooqChmKK.seg_611", "src_text": "We have defined three settings of KITMUS.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir haben drei Einstellungen von Kidmo definiert.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_289.wav", "doc_id": "PIZEXUFLAR.seg_289", "src_text": "If the task is a multi-modal classification task, we report accuracy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wenn die Aufgabe eine multimodale Klassifikation ist, berichten wir über Genauigkeit, wenn", "score": 25.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_652.wav", "doc_id": "FLkGnzVRew.seg_652", "src_text": "To alleviate this, we experiment over combinations of transfer learning and active learning to annotate such that more dissonant samples can be collected over lesser annotation runs, lowering the overall annotation costs while improving dissonance detection.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Um dies zu verbessern, werden die Experimente über Kombinationen von Transferlernen und aktivem Lernen durchgeführt, sodass mehr disjunkte Proben über niedrigere Annotationsrunden gesammelt werden können, um die Gesamtkosten der Annotation durch verbesserte Disjunktion zu senken.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_581.wav", "doc_id": "rISrKoXQCx.seg_581", "src_text": "It's like between Scylla and Charybdis.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "zwischen den Sprachräumen liegt.", "score": 6.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_221.wav", "doc_id": "oYCKgTzTDy.seg_221", "src_text": "For instance, there are lots of coverage on certain natural languages.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "beispielsweise. Es gibt Lücken in der Berichterstattung über bestimmte natürliche Sprachen.", "score": 19.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_761.wav", "doc_id": "XejEJmgUmE.seg_761", "src_text": "Now this and this is very large like this effect, increases throughout the context length and this would probably affect like newer language models which has large context window.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Dieser Effekt ist sehr groß, er vergrößert sich über die Kontextlänge und würde wahrscheinlich neue Sprachmodelle mit großen Kontextfenstern beeinflussen.", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_21.wav", "doc_id": "aQpIWggfCo.seg_21", "src_text": "Thus, we adopt the idea of over-generate-then-filter to improve generation quality.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Daher haben wir die Idee der Übererzeugung des Filters zur Verbesserung der Generationsqualität übernommen. Zunächst zeigen", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_321.wav", "doc_id": "dJGfOSFgZO.seg_321", "src_text": "ABC-Eval is capable of measuring the rates at which chat models will commit various thematic errors.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "ist in der Lage, die Raten zu messen, bei denen die Modelle verschiedene thematische Fehler aufweisen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_70.wav", "doc_id": "TVCREhgqUP.seg_70", "src_text": "After the first step, we have all the right tokens, but they're not ordered.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Nach dem ersten Schritt haben wir alle richtigen Token, aber sie sind nicht bestellt.", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_789.wav", "doc_id": "WTTtiRKFZI.seg_789", "src_text": "So \"Marge read it yesterday\" is fine because the direct object is close to the verb, while \"Marge read yesterday it\" is much worse.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "denn so ist es heute gut, weil das direkte Objekt der Verfremdung unterworfen ist. Es ist viel schlimmer, wenn man es rot", "score": 4.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_539.wav", "doc_id": "dvGkKzmIaN.seg_539", "src_text": "The results on four data sets show that our embedding marker can have great detection performance while keep great utility for downstream tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die Ergebnisse auf vier Datensätzen zeigen, dass unser eingebetteter Marker eine großartige Detektionsleistung haben kann, während er für Downstream-Aufgaben eine großartige Nutzbarkeit bietet.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_449.wav", "doc_id": "hgIDlKNiFM.seg_449", "src_text": "Another also based on CamemBERT, but trained this time on the 4 GB of clinical notes and finally, one based on English biomedical model PubMedBERT, and trained on 4 GB of set of NACHOS.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "basiert ebenfalls auf Camber, trainiert aber dieses Mal auf vier Gigabytes von Clinch. Insgesamt haben wir sieben Modelle, von denen eines auf dem englischen Biomedical-Modell basiert, und vier Gigabytes an Natur.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_399.wav", "doc_id": "WBLMIsdIrq.seg_399", "src_text": "And this is done by measuring how much information the context C provides about the target Y, given the source X. You can think of CXMI as the information gained from giving context to the model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und dies dadurch erreicht, dass wir messen, wie viel Informationen der Kontext C über das Ziel Y liefert, gegeben die Quelle X. Sie können CxMI als die von der Kontextnutzung gewonnene Information denken.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_448.wav", "doc_id": "hgIDlKNiFM.seg_448", "src_text": "One based on the weight of CamemBERT and trained on a 4 GB set of NACHOS.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Ein basiert auf dem Gewicht von Camembert und trainiert auf vier Gigabyte von NACHOS. Wir haben", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_196.wav", "doc_id": "SLpqvupgvW.seg_196", "src_text": "So what we do is that we show some background knowledge about the two entities.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Also was wir tun, ist, dass wir etwas Hintergrundwissen über die Zwergsterne zeigen;", "score": 33.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_162.wav", "doc_id": "SLpqvupgvW.seg_162", "src_text": "Our goal is to understand users’ language when they want to make a choice.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Unser Ziel ist es, die Sprache des Benutzers zu verstehen, wenn er eine Wahl treffen möchte.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_720.wav", "doc_id": "oaOHnMCwad.seg_720", "src_text": "And the other is to do NLP research with the lens of perspectivism.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und die zweite ist, NLP-Forschung mit dem Perspektivismus zu machen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_366.wav", "doc_id": "gGbuDbHhyc.seg_366", "src_text": "The right figure shows the performance difference between fine-tuning approaches, which are directly applied on the clean data, and WSL approaches, which use the clean data for validation only.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Das rote Feld zeigt den Leistungsdifferenz zwischen Fine-Tuning-Ansätzen, die direkt auf saubere Daten angewandt werden, und WS-L-Ansätzen, die die sauberen Daten nur zur Validierung verwenden.", "score": 83.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_845.wav", "doc_id": "GvEBWkLmuI.seg_845", "src_text": "And also this enables direct comparison between our generated personas and the human written responses.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "können auch direkt zwischen unseren generierten Personen und den menschlichen Antworten verglichen werden.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_524.wav", "doc_id": "dvGkKzmIaN.seg_524", "src_text": "We assume the provider can collect a general text corpus and count the word frequency with it.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir nehmen an, dass der Anbieter einen allgemeinen Textkörper sammeln und die Wortfrequenz damit zählen kann. Bei", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_556.wav", "doc_id": "rISrKoXQCx.seg_556", "src_text": "So specifically, we first proposed to prompt language models with different prompt formats using the political questionnaires such as the political conference test.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "schlagen wir zuerst vor, zwei sprachliche Modelle zu präsentieren, mit unterschiedlichen präsentierten Formaten, die die politischen Fragen verwenden, wie", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_770.wav", "doc_id": "XejEJmgUmE.seg_770", "src_text": "Thank you for listening.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Experimenten.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_595.wav", "doc_id": "oeooqChmKK.seg_595", "src_text": "Pretrained parameters can contain information about what presidents do and what a TV is but they cannot reliably know who this instance-specific entity \"John\" is, or who the new president is, because the president might have changed since pretraining.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Parameter können Informationen darüber enthalten, was Präsidenten tun und wer Teller ist, aber sie können nicht zuverlässig wissen, wer diese spezifische Einheit ist, John, oder wer der neue Präsident ist, weil der Präsident sich seit seiner Vorbereitung geändert haben könnte.", "score": 88.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_181.wav", "doc_id": "SLpqvupgvW.seg_181", "src_text": "And in the third speech bubble, Bob uses an indirect reference to select one of these entities, for example, \"the newer one.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Frage, und in der dritten Rede-Sphäre verwendet Bob indirekte Referenzen, um eine dieser Entitäten auszuwählen, zum Beispiel das neue", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_193.wav", "doc_id": "SLpqvupgvW.seg_193", "src_text": "And finally when they have similar info boxes or attributes on Wikipedia.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und schließlich, wenn sie ähnliche Infoboxen oder Attribute auf Wikipedia haben,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_653.wav", "doc_id": "FLkGnzVRew.seg_653", "src_text": "Since the initial model was not able to capture the dissonance class at all, we start the active learning process by transferring weights from closely related tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Da das ursprüngliche Modell nicht in der Lage war, die Fernlehre zu erlernen, begannen wir mit dem aktiven Lernprozess, indem wir Gewichte von eng beieinander liegenden Aufgaben übertrugen.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_223.wav", "doc_id": "oYCKgTzTDy.seg_223", "src_text": "The Lambda calculus is missing, or they're only evaluated on certain neural models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Lamda-Coclyles ist verschwunden. Oder sie werden nur auf bestimmten neueren Modellen bewertet;", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_48.wav", "doc_id": "TVCREhgqUP.seg_48", "src_text": "My name is Matthias Lindemann, and today I'm going to give you a brief introduction to our paper on \"Compositional Generalization without Trees using Multiset Tagging and Latent Permutations\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "mein Name ist Matthias Lindemann und heute werde ich Ihnen eine kurze Einführung in unser Papier über kompositionelle Generalisierung ohne Bäume mit Multi-Cell-Tagging und latenten Permutationen geben.", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_666.wav", "doc_id": "FLkGnzVRew.seg_666", "src_text": "We also check the feasibility of each strategy for annotation quality and costs to annotators.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir prüfen auch die Machbarkeit jeder Strategie für die Qualität der Annotation und die Kosten für Annotatoren.", "score": 82.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_395.wav", "doc_id": "WBLMIsdIrq.seg_395", "src_text": "First, when does translation require context?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Erstens: Wann benötigt eine Übersetzung einen", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_694.wav", "doc_id": "oaOHnMCwad.seg_694", "src_text": "The first step is to re annotate data sets with diverse annotators.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Der erste Schritt besteht darin, Datensätze mit verschiedenen Annotatoren zu annotieren.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_209.wav", "doc_id": "SLpqvupgvW.seg_209", "src_text": "If the language model has access to some partially overlapping background knowledge, then the accuracy is between 82 to 87%, which is more realistic.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wenn das Sprachmodell Zugriff auf ein teilweise überlappendes Hintergrundwissen hat, liegt die Genauigkeit zwischen 82 und 87 Prozent, was realistischer ist,", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_764.wav", "doc_id": "XejEJmgUmE.seg_764", "src_text": "And after doing like several of these perturbations, we find that none of these noises are actually making the model like change its course in terms of how it shows us the MPP judgement print.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "und danach mehrere dieser Störungen durchführen. Wir stellen fest, dass keiner dieser Geräusche den Kurs des Modells tatsächlich ändert, in Bezug darauf, wie es uns den Trend des MP-Judikats zeigt. Grundsätzlich", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_647.wav", "doc_id": "FLkGnzVRew.seg_647", "src_text": "Tweets were passed using the PDTB parser, and pairs of discourse units were annotated according to the guidelines that are described in our paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "ist. Tweets wurden mit einem PDT-Parser übertragen und einige Diskussionsbeiträge wurden entsprechend den Richtlinien, die in unserem Papier beschrieben sind, annotiert.", "score": 84.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_177.wav", "doc_id": "SLpqvupgvW.seg_177", "src_text": "In the first bubble, Bob says, \"Remember that song we were listening to yesterday?\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In der ersten Blase sagt Bob: „Denkst du daran, an das Lied, das wir gestern gehört haben?“", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_459.wav", "doc_id": "hgIDlKNiFM.seg_459", "src_text": "Finally, as a conclusion our proper system offered better performance on nine of the 11 downstream tasks and surpassed globally the result of the generic model, here CamemBERT.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "stability issues. Schließlich bietet unser System eine bessere Leistung bei neun der elf nicht-trivialen Aufgaben und übertrifft global das Ergebnis des allgemeinen Modells hier, Camembert. Wir beobachten auch,", "score": 61.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_404.wav", "doc_id": "WBLMIsdIrq.seg_404", "src_text": "We perform our analysis at three different levels.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir führen unsere Analysen auf drei verschiedenen Ebenen durch.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_12.wav", "doc_id": "aQpIWggfCo.seg_12", "src_text": "As shown in the table, we extend the abstract goals with multi-faceted constraints for human-in-the-loop data acquisition using InstructGPT.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wie in der Tabelle gezeigt, erweitern wir die abstrakten Ziele mit mehrfach konstruktiven Einschränkungen für die Menschen, die im Look-up-Datenbesitz den instruktiven GPT verwenden.", "score": 46.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_720.wav", "doc_id": "oaOHnMCwad.seg_720", "src_text": "And the other is to do NLP research with the lens of perspectivism.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "ist es die Lp-Forschung der Perspektiven. Die", "score": 45.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_101.wav", "doc_id": "uZBWfYjYnf.seg_101", "src_text": "First, to use already existing offline ST models without re-training or adopting specific architecture for SimulST.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Erstens verwenden Sie bereits existierende Offline-Modelle, ohne eine spezifische Architektur für Civils zu", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_16.wav", "doc_id": "aQpIWggfCo.seg_16", "src_text": "Then we conduct detailed analysis to investigate why learning models fail.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Dann führen wir detaillierte Analysen durch, um zu erforschen, wofür Land-Level-Modelle dienen. Die Ergebnisse", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_465.wav", "doc_id": "SUkmfOTvGi.seg_465", "src_text": "Let's get started.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Sie uns anfangen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_108.wav", "doc_id": "uZBWfYjYnf.seg_108", "src_text": "This means that the first two words will be emitted while since the sum of the cross-attention is above a certain threshold alpha, we will not emit the last word and we wait for another speech chunk.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "die Lambda-Sprechrahmen. Dies bedeutet, dass die ersten beiden Wörter ausgelassen werden, während die Summe Wenn die Aufmerksamkeit auf einen bestimmten Themenbereich alpha fokussiert ist, werden wir das letzte Wort nicht aussprechen und auf einen anderen Sprechabschnitt warten.", "score": 45.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_843.wav", "doc_id": "GvEBWkLmuI.seg_843", "src_text": "The first one is generating these personas.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "besteht, und die erste erzeugt diese Personen.", "score": 74.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_760.wav", "doc_id": "XejEJmgUmE.seg_760", "src_text": "But when we match the structure, that is when we choose the sentences from the same phenomena in BLiMP or SyntaxGym, we see a massive increase or a massive decrease of the MPP judgement for the model, depending on whether the chosen prefix is acceptable or unacceptable.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Aber wenn wir die Struktur abgleichen, also wenn wir die Sätze aus demselben Phänomen in Blame Person Taxonomy BPT auswählen, sehen wir einen massiven Anstieg oder einen massiven Rückgang der Bewertung des Modells für die MPP, je nachdem, ob der ausgewählte Präfix akzeptabel oder nicht akzeptabel ist.", "score": 81.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_557.wav", "doc_id": "rISrKoXQCx.seg_557", "src_text": "This ensures us to do automatic evaluation well grounded in political science literature.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "eine automatische Bewertung in der politischen Literatur sicherstellt. So zeigen", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_767.wav", "doc_id": "XejEJmgUmE.seg_767", "src_text": "So, the key takeaways of our work is that language models are sensitive to latent syntactic and semantic features which are shared across the sentences.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Das Schlüsselwort unserer Arbeit ist also, dass Sprachmodelle gegenüber latenten syntaktischen und semantischen Merkmalen empfindlich sind, die über die Sätze hinweg geteilt werden.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_739.wav", "doc_id": "XejEJmgUmE.seg_739", "src_text": "So it's crucial that we evaluate the models' acceptability throughout the context window and that is what we are trying to do here.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "auf, also ist es entscheidend, dass wir die Akzeptanz der Modelle auf der gesamten Kontextfenster bewerten und das ist, was wir hier versuchen zu tun:", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_68.wav", "doc_id": "TVCREhgqUP.seg_68", "src_text": "Our approach predicts the output from the input in two steps.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Unsere Herangehensweise prognostiziert die Ausgabe aus der Eingabe in zwei Schritten Erstens", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_60.wav", "doc_id": "TVCREhgqUP.seg_60", "src_text": "A popular method to address this is to integrate trees into the models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Eine beliebte Methode, dies anzusprechen, ist die Integration von Bäumen in die Modelle.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_743.wav", "doc_id": "XejEJmgUmE.seg_743", "src_text": "So for example, here we have chosen like a typical pair of grammaticality from the BLiMP data set from the Adjunct Island case.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hier haben wir beispielsweise typische Paare von Grammatik aus dem Blimp-Datensatz aus dem Fall der angrenzenden Insel ausgewählt.", "score": 5.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_135.wav", "doc_id": "wLqFAuDnKa.seg_135", "src_text": "The difference observed is of more than one BLEURT points.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "sich um mehr als einen Blip.", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_154.wav", "doc_id": "wLqFAuDnKa.seg_154", "src_text": "So, it seems that PaLM chooses to produce a better-sounding translation, sometimes by dropping parts of the source sentence that are made in translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Palme entscheidet sich also manchmal für eine bessere Übersetzung, indem er Teile des Originalsatzes weglässt.", "score": 82.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_698.wav", "doc_id": "oaOHnMCwad.seg_698", "src_text": "Our frame is largely enabled through Lab in the Wild and online crowdsourcing platform for where HCI collaborator.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Unser Framework wird in erster Linie durch Lab in the Wild, eine Online-Plattform für Crowdsourcing, aktiviert, die ein ehemaliger HCI-Kollaborator ist.", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_11.wav", "doc_id": "aQpIWggfCo.seg_11", "src_text": "Since no dataset of specific goals exists to support our study, we have to acquire these goals first.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "die unsere Studien belegen. Wir müssen diese Ziele zuerst erlangen,", "score": 6.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_519.wav", "doc_id": "dvGkKzmIaN.seg_519", "src_text": "Then let me introduce the details of our embedding marker.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dann lassen Sie mich die Einzelheiten unseres eingebetteten Markers erläutern.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_716.wav", "doc_id": "oaOHnMCwad.seg_716", "src_text": "We find this in the GPT 4 social acceptability task as well as the Dynahate task analysis as well.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir finden dies im GPT-4-Social-Acceptability-Tasks. Auch die Analyse der Task-Performance ist ebenfalls", "score": 78.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_717.wav", "doc_id": "oaOHnMCwad.seg_717", "src_text": "So, given that there is positionality in NLP, what can we do about it?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "möglich. Also, da es eine Position in der NLP gibt, was können wir tun?", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_601.wav", "doc_id": "oeooqChmKK.seg_601", "src_text": "Servin is a judge.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Servin ist ein", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_247.wav", "doc_id": "oYCKgTzTDy.seg_247", "src_text": "We found it is because most of the major natural languages can obtain performance gain, except that English performance drops in seven datasets and only gains in three datasets.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "man es herausfindet, ist es, weil die meisten der wichtigsten natürlichen Sprachen Leistungsgewinne erzielen können, mit Ausnahme, dass die Leistung des Englischen in sieben Datensätzen sinkt und nur in drei Datensätzen steigt.", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_548.wav", "doc_id": "rISrKoXQCx.seg_548", "src_text": "So language models are trained on large scale web crawl data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "die zu unfairen und ungenauen Modellen führen. Die Sprachmodelle werden auf großen Webdatensätzen", "score": 49.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_375.wav", "doc_id": "gGbuDbHhyc.seg_375", "src_text": "First, report the model selection criteria.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Berichten Sie zunächst die Kriterien für die Modellauswahl; zum", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_197.wav", "doc_id": "SLpqvupgvW.seg_197", "src_text": "For songs, we simply show a Google search link to each song and then ask the annotators to listen to at least some of each song, and read about each song.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "für einige Lieder zeigen wir einfach einen Google-Suchlink zu jedem Lied Und dann bitte ich die Annotatoren, sich wenigstens ein paar Lieder anzuhören und sich die Lieder anzuhören.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_707.wav", "doc_id": "oaOHnMCwad.seg_707", "src_text": "So now we're better equipped to answer who do NLP datasets and models align with the most.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "sind wir in der Lage, die Modelle mit den meisten Daten zu beantworten. Wir stellen fest, dass es sich", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_784.wav", "doc_id": "WTTtiRKFZI.seg_784", "src_text": "Here loves to all conjuncts separately: Lisa, Bart, and Maggie.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "von der Regierungsstruktur geliebt werden, zu allen Konjunktionen separat, die sich auf die Konjunktion beziehen.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_502.wav", "doc_id": "dvGkKzmIaN.seg_502", "src_text": "Are you copying my model?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Werbevideo über Papier zu machen,", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_245.wav", "doc_id": "oYCKgTzTDy.seg_245", "src_text": "And we evaluate on mT5 and XLM-R + PTR on multilingual setting.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "MT5 und XLMRPDR auf einer multilingualen Einstellung", "score": 38.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_58.wav", "doc_id": "TVCREhgqUP.seg_58", "src_text": "Naive seq2seq models struggle with this kind of out-of-distribution generalization and often produce outputs that are detached from the input.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Sequenz- zu Sequenz-Modelle kämpfen mit dieser Art von Verteilungsverallgemeinerung und produzieren oft Ausgaben, die vom Input abgezogen werden.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_735.wav", "doc_id": "XejEJmgUmE.seg_735", "src_text": "And in this, minimal pair paradigm, the typical way to evaluate language models is that you show like an acceptable sentence or a grammatical sentence and then you show an acceptable sentence or an ungrammatical sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und in diesem minimalen Paradigma ist die typische Art und Weise, Sprachmodelle zu bewerten, dass man einen akzeptablen Satz oder einen grammatikalischen Satz zeigt und dann einen unakzeptablen Satz oder einen ungrammatischen Satz zeigt. Und", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_604.wav", "doc_id": "oeooqChmKK.seg_604", "src_text": "After a long day at work deciding cases in a law court, he was happy to relax.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "nachdem sie einen langen Tag im Gerichtsverfahren verbracht hatten, und waren froh, sich zu entspannen.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_35.wav", "doc_id": "aQpIWggfCo.seg_35", "src_text": "In total, we generate 55,000 specific goals with scripts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Insgesamt generieren wir 55.000 spezifische Ziele mit Skripten,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_383.wav", "doc_id": "WBLMIsdIrq.seg_383", "src_text": "Hello, my name is Kayo Yin and I will be presenting our work titled \"When Does Translation Require Context?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hallo, mein Name ist Chiao-ying und ich werde unsere Arbeit mit dem Titel 'When does translation require context", "score": 82.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_870.wav", "doc_id": "GvEBWkLmuI.seg_870", "src_text": "And while it sounds positive at first glance, there's been work showing that this kind of archetype actually is very harmful because it puts a lot of pressure on these demographics to be resilient and strong against societal obstacles.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und der auf den ersten Blick positiv klingt. Es hat sich gezeigt, dass diese Art von Archetyp sehr schädlich ist, da er eine Menge Druck auf diese Demografien ausübt, um widerstandsfähig und stark gegen gesellschaftliche Hindernisse zu sein.", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_475.wav", "doc_id": "SUkmfOTvGi.seg_475", "src_text": "And last but not least, we calculated the percentage change in F1 to assess the generalization of each model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "haben wir den Prozentsatz des F1-Wertes berechnet, um die Generalisierung jedes Modells zu bewerten.", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_184.wav", "doc_id": "SLpqvupgvW.seg_184", "src_text": "The second one, which is the alternative question is generated as follows.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die zweite, die alternative Frage, wird wie folgt generiert.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_272.wav", "doc_id": "PIZEXUFLAR.seg_272", "src_text": "Here we present MultiInstruct, the first multi-modal instruction tuning benchmark dataset that consists of 62 diverse multi-modal tasks covering 10 broad categories.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hier stellen wir MultiInstruct, den ersten multiModal Instruction Tuning Benchmark vor, der aus zweiundsechzig verschiedenen multiModal Aufgaben besteht, die zehn Kategorien abdecken.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_299.wav", "doc_id": "PIZEXUFLAR.seg_299", "src_text": "As we can see, using more instructions can improve the model's overall performance and reduce its sensitivity a lot.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "wir sehen können, kann die Verwendung mehrerer Anweisungen die Gesamtleistung des Modells verbessern und seine Empfindlichkeit stark verringern.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_432.wav", "doc_id": "hgIDlKNiFM.seg_432", "src_text": "In this presentation, we first talk about language modeling in healthcare.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "In dieser Präsentation sprechen wir zunächst über die Modellierung von Sprachen im Gesundheitswesen, dann", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_855.wav", "doc_id": "GvEBWkLmuI.seg_855", "src_text": "So first we use a lexicon of stereotypes, and we find that the generated personas contain a lot more stereotypes than the human-written ones.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "verwenden Sie zuerst Stereotypen und wir finden, dass die generierten Personen viel mehr Stereotypen enthalten als die menschlichen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_539.wav", "doc_id": "dvGkKzmIaN.seg_539", "src_text": "The results on four data sets show that our embedding marker can have great detection performance while keep great utility for downstream tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die Ergebnisse der Datensätze zeigen, dass unser eingebetteter Marker eine gute Erkennungsleistung haben kann und gleichzeitig eine gute Nutzbarkeit für Down-Screen-Aufgaben beibehält.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_145.wav", "doc_id": "wLqFAuDnKa.seg_145", "src_text": "So it's important to select the examples from high-quality translations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "ist also wichtig, die Beispiele aus hochwertigen Übersetzungen auszuwählen,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_30.wav", "doc_id": "aQpIWggfCo.seg_30", "src_text": "Since large language models are costly to deploy, it's essential to enable language planning ability of smaller and specialized models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Da die Bereitstellung großer Sprachmodelle teuer ist, ist es unerlässlich, Sprachplanung für kleinere und spezialisierte Modelle zu ermöglichen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_695.wav", "doc_id": "oaOHnMCwad.seg_695", "src_text": "And we ought to do this over looking at the demographics of original data sets annotators, because, usually only a few annotators annotate each instance and because demographics are rarely collected and shared.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und wir werden das über die Betrachtung der Demografien der ursprünglichen Datensätze sehen, weil normalerweise nur wenige Annotatoren vorkommen und weil die Demografien wirklich gesammelt und geteilt", "score": 25.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_549.wav", "doc_id": "rISrKoXQCx.seg_549", "src_text": "Political news media are well covered in their pretraining data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Politische Nachrichtenmedien sind in ihren Vorbereitungsdaten enthalten. Laut", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_60.wav", "doc_id": "TVCREhgqUP.seg_60", "src_text": "A popular method to address this is to integrate trees into the models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Eine beliebte Methode zur Adressierung dieser ist die Integration von Trees in den Modellen.", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_180.wav", "doc_id": "SLpqvupgvW.seg_180", "src_text": "Which is the alternative question.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Das ist die alternative", "score": 31.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_111.wav", "doc_id": "uZBWfYjYnf.seg_111", "src_text": "If we look at the main results of EDAtt, we'll plot the simultaneous speech translation results on graphs in which we have BLEU on one side that measures the translation quality, and average lagging that is the latency measure, and we also consider the computational aware average lagging that accounts for the model's computational times to predict the output.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wenn Sie sich die wichtigsten Ergebnisse ansehen. Wir zeichnen die simultanen Übersetzungsresultate auf Graphen, auf denen wir einen blauen Punkt auf der einen Seite haben, der die Übersetzungsqualität misst und einen Durchschnittswert angibt. Das ist die Latenzzeit und wir betrachten auch den computergestützten durchschnittlichen Fehler, der für die Modelle berechnete Ausgabewerte vorhersagt.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_88.wav", "doc_id": "TVCREhgqUP.seg_88", "src_text": "Our permutation method is very flexible, but it brings the challenge that finding the highest-scoring permutation is NP-hard.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Unsere Permutationsmethode ist sehr flexibel, aber sie bringt die Herausforderung mit sich, dass das Finden der höchstpunktreichen Permutation npe schwer ist. Das liegt daran,", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_426.wav", "doc_id": "WBLMIsdIrq.seg_426", "src_text": "So this sort of suggests where we would need to see more progress for document-level translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "wie Ellipsen, Pronomen und Verbform. Hier müssten wir mehr Fortschritte bei der Dokumenten-Ebene-Übersetzung sehen.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_132.wav", "doc_id": "wLqFAuDnKa.seg_132", "src_text": "Finally, we provide some recommendations for prompt selection strategies.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "geben wir einige Empfehlungen für schnelle Auswahlstrategien. Das", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_331.wav", "doc_id": "dJGfOSFgZO.seg_331", "src_text": "On the other hand, the combination of all turn-level Likert metrics explains far less of the quality, and fewer of these metrics carry unique information.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "der anderen Seite erklärt die Kombination aller Lickert-Metriken weit weniger von der Qualität, und wenige dieser Metriken tragen einzigartige Informationen.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_613.wav", "doc_id": "oeooqChmKK.seg_613", "src_text": "Second, there's a \"Background-Both\" setting, where background knowledge is available both at pretrain time and inference time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zweitens gibt es die Hintergrundausrichtung, bei der das Hintergrundwissen sowohl in der Vorbereitungszeit als auch in der Interventionszeit verfügbar ist.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_53.wav", "doc_id": "TVCREhgqUP.seg_53", "src_text": "In this case, \"The girl slept.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In diesem Fall schlief das Mädchen und", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_390.wav", "doc_id": "WBLMIsdIrq.seg_390", "src_text": "So, depending on context, the meaning of the word changes, and therefore its translation changes as well.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "So ist die Bedeutung des Wortes im Kontext zu verstehen und", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_176.wav", "doc_id": "SLpqvupgvW.seg_176", "src_text": "The cartoon has three speech bubbles.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Der Cartoon hat drei Sprechblasen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_324.wav", "doc_id": "dJGfOSFgZO.seg_324", "src_text": "For comparison, we also evaluated these conversations using three existing methods: Likert ratings on the turn-level, Likert ratings on the dialogue-level, and dialogue-level pairwise comparisons.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zum Vergleich bewerteten wir diese Gespräche auch mit drei bestehenden Methoden: Lickert-Bewertungen auf der Turn-Ebene, Lickert-Bewertungen auf der Dialogebene und Dialogebene-Paarvergleiche.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_588.wav", "doc_id": "rISrKoXQCx.seg_588", "src_text": "Thank you for your time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "habe, danke für Ihre Zeit.", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_354.wav", "doc_id": "gGbuDbHhyc.seg_354", "src_text": "The aforementioned doubt is asked to ask three research questions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Der oben erwähnte Zweifel veranlasst uns, drei Forschungsfragen zu stellen:", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_441.wav", "doc_id": "hgIDlKNiFM.seg_441", "src_text": "However, French didn't have any open source model for biomedical until now.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "neues Open-Source-Modell für Bio-Medical. Wir stellen also selbst die Frage,", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_195.wav", "doc_id": "SLpqvupgvW.seg_195", "src_text": "When we show this alternative question to the annotators, they know the name of these entities, but they don't necessarily know about the entities.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wenn wir den Anwälten diese alternative Frage stellen, kennen sie den Namen dieser Entitäten, aber sie wissen nicht unbedingt etwas über die Entitäten.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_195.wav", "doc_id": "SLpqvupgvW.seg_195", "src_text": "When we show this alternative question to the annotators, they know the name of these entities, but they don't necessarily know about the entities.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "alternativen Fragen zu den Antragstellern stellen, wissen sie den Namen dieser Entitäten, aber sie wissen nicht unbedingt etwas über diese Entitäten.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_329.wav", "doc_id": "dJGfOSFgZO.seg_329", "src_text": "Finally, we checked whether each evaluation metric captures a unique aspect of chat quality using a stepwise linear regression.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Schließlich prüfen wir, ob jede Beurteilungsmetrik einen einzigartigen Aspekt der Prüfungskriterien erfasst, indem wir eine Schrittweise lineare Regression verwenden. Sie", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_763.wav", "doc_id": "XejEJmgUmE.seg_763", "src_text": "So we did a series of analysis where we tried to perturb the input sentence by, trying to preserve the relevant structure but adding like noise to the input.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir führten eine Reihe von Analysen durch, bei denen wir versuchten, den Eingabesatz zu stören, um die relevante Struktur zu bewahren, aber Rauschen zum Eingang hinzuzufügen", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_712.wav", "doc_id": "oaOHnMCwad.seg_712", "src_text": "We also find most additional alignment with people who have a college education.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir finden auch, dass wir die meisten zusätzlichen Verbindungen mit Personen haben, die", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_235.wav", "doc_id": "oYCKgTzTDy.seg_235", "src_text": "And we test Multilingual Model which we train one multilingual model for all languages.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Es hat ein mehrsprachiges Modell, mit dem wir ein mehrsprachiges Modell für alle Sprachen trainieren.", "score": 78.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_543.wav", "doc_id": "dvGkKzmIaN.seg_543", "src_text": "That's all.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Das ist alles.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_147.wav", "doc_id": "wLqFAuDnKa.seg_147", "src_text": "The dev data is much more curated, and with higher quality than the training data, that it's more noisy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die Daten sind viel besser qualifiziert, und mit hoher Qualität sind", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_755.wav", "doc_id": "XejEJmgUmE.seg_755", "src_text": "We increase the context length toward up to 1024 for to max out OPT and GPT 2 models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir haben die Kontextlänge auf 2002 erhöht, um die ODT- und GPT-Modelle zu maximieren, und", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_293.wav", "doc_id": "PIZEXUFLAR.seg_293", "src_text": "Here is our main result.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hier ist unser Hauptergebnis,", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_57.wav", "doc_id": "TVCREhgqUP.seg_57", "src_text": "In this example, the model has seen shallow recursion during training and is tested on an example with deeper recursion.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In diesem Beispiel hat das Modell während der Ausbildung eine geringere Rekursion gesehen und wird auf einem Beispiel mit tieferer Rekursion getestet.", "score": 68.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_215.wav", "doc_id": "oYCKgTzTDy.seg_215", "src_text": "Hello everyone, my name is Yusen Zhang from the Penn State University.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hallo, ich bin Yuchen John von der Penn State University.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_721.wav", "doc_id": "oaOHnMCwad.seg_721", "src_text": "Our third recommendation is to build specialised datasets and models within 4 specific communities.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Unsere dritte Empfehlung besteht darin, spezialisierte Datensätze und Modelle innerhalb von vier spezifischen Gemeinschaften", "score": 86.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_135.wav", "doc_id": "wLqFAuDnKa.seg_135", "src_text": "The difference observed is of more than one BLEURT points.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "die Differenz von Sur ist von mehr als einem Blur-Punkt.", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_32.wav", "doc_id": "aQpIWggfCo.seg_32", "src_text": "However, previous studies do not enable planning for specific goals and manual dataset annotation is expensive.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "ist es jedoch nicht möglich, für bestimmte Ziele zu planen, und die manuelle Anmerkung des Datensatzes ist teuer.", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_328.wav", "doc_id": "dJGfOSFgZO.seg_328", "src_text": "For example, you can see how measuring the proportion of turns with self and partner contradictions explains 5% and 10% of conversation quality, respectively, while the average Likert consistency scores explain only 4% or less.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "wird. Zum Beispiel können Sie sehen, wie die Messung der Proportion von Turns mit Selbst- und Partnerwidersprüchen fünf Prozent bzw. zehn Prozent der Gesprächsqualität erklärt, während die durchschnittlichen Likert-Konsistenzwerte nur vier Prozent oder weniger erklären.", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_133.wav", "doc_id": "wLqFAuDnKa.seg_133", "src_text": "The prompting has a big influence on the performance of the LLMs for translation, as we can see in a simple experiment, where we used one-shot prompting and provided two different prompts for each sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "hat einen großen Einfluss auf die Leistung der ELDs für die Übersetzung, wie wir in einem einfachen Experiment sehen, in dem wir eine einmalige Stimulierung verwendeten und zwei verschiedene Stimulierungen für eine Satz verliehen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_504.wav", "doc_id": "dvGkKzmIaN.seg_504", "src_text": "Let's first introduce the background about embedding as services.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "zu schützen. Lassen Sie uns", "score": 11.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_793.wav", "doc_id": "WTTtiRKFZI.seg_793", "src_text": "Because then it can be moved to the position after the adjunct.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "in die Position nach dem Direkts bewegt", "score": 8.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_302.wav", "doc_id": "PIZEXUFLAR.seg_302", "src_text": "We also can see transfer learning from natural instruction datasets can help OFA to attain much better performance on the natural instruct dataset.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir können auch sehen, dass die Übertragung von Naturlehrdatensätzen auf den Naturlehrdatensatz OVA helfen kann, eine viel bessere Leistung auf dem Naturlehrdatensatz zu erzielen.", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_682.wav", "doc_id": "oaOHnMCwad.seg_682", "src_text": "This is an example of a design bias where we see systematic performance differences of technology between populations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dies ist ein Beispiel für die Design-Bias, bei dem wir die systematischen Leistungsunterschiede der Technologie zwischen den Bevölkerungsgruppen sehen.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_707.wav", "doc_id": "oaOHnMCwad.seg_707", "src_text": "So now we're better equipped to answer who do NLP datasets and models align with the most.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Ländern. Daher werden wir nun besser ausgerüstet, um zu beantworten, wer die NLP-Daten modelliert, die mit den meisten übereinstimmen, und wir stellen", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_651.wav", "doc_id": "FLkGnzVRew.seg_651", "src_text": "Given the low occurrence of dissonance and absence of any prior such data set, we are facing the problem of absolute rarity.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Anzahl von Abwesenheiten und der Abwesenheit jeder vorherigen Datenmenge stehen wir vor dem Problem der absoluten Häufigkeit.", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_225.wav", "doc_id": "oYCKgTzTDy.seg_225", "src_text": "So to this end we propose XSemPLR.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Galizisch, Baskisch, Welsh, Bretonisch, Cornisch,", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_772.wav", "doc_id": "WTTtiRKFZI.seg_772", "src_text": "As you may know, there are different dependency structures assumed by different theories and corpus approaches.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Siehe die unterschiedlichen Abhängigkeitsstrukturen, die von verschiedenen Theorien und Konzepten ausgehen, beispielsweise", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_239.wav", "doc_id": "oYCKgTzTDy.seg_239", "src_text": "We train on one source language and transfer to another language.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir trainieren auf einer Quellensprache und übertragen sie auf eine andere Sprache,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_692.wav", "doc_id": "oaOHnMCwad.seg_692", "src_text": "We do this through our framework NLPositionality.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir machen das über unser Framework und unsere Positionalität.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_579.wav", "doc_id": "rISrKoXQCx.seg_579", "src_text": "So a little bit of discussion.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "müssen.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_759.wav", "doc_id": "XejEJmgUmE.seg_759", "src_text": "And there we see that the MPP judgments either increase or decrease significantly when you add either acceptable prefixes or unacceptable prefixes.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und dort sehen wir, dass die MP-Beurteilungen entweder steigen oder sich erheblich verringern, wenn man entweder akzeptable oder unakzeptable Präfixe hinzufügt.", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_818.wav", "doc_id": "WTTtiRKFZI.seg_818", "src_text": "In such cases, the left conjunct prefers to be shorter; the most of the biggest difference between the two conjuncts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "In solchen Fällen sollte die Verbindung eher kurz als lang sein. Allerdings, wenn die Regierung", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_656.wav", "doc_id": "FLkGnzVRew.seg_656", "src_text": "Further, on iteratively fine-tuning on both tasks, we find that fine-tuning of CE tasks followed by further fine-tuning on debate yields a much better zero-shot performance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "C Point 6. Weiterhin bei der iterativen Feinabstimmung auf beide Aufgaben stellen wir fest, dass die Feinabstimmung von CE-Aufgaben, gefolgt von einer weiteren Feinabstimmung auf der Debatte, eine viel bessere Leistung mit Null-Schuss-Performances", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_273.wav", "doc_id": "PIZEXUFLAR.seg_273", "src_text": "These tasks are derived from 21 existing open-source dataset and each task is equipped with five expert written instructions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Diese Aufgaben basieren auf 21 bestehenden Open-Source-Datensätzen, und jede Aufgabe ist mit fünf Anweisungen ausgestattet. Für die Untersuchung", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_495.wav", "doc_id": "SUkmfOTvGi.seg_495", "src_text": "So going back to the question that we raised in the title of our paper Do CoNLL-2003 taggers still work in 2023?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "wird. Also, zurück zu der Frage, die wir in der Überschrift unserer Arbeit aufgeworfen haben: Funktionieren die Korn 2003-Tags in 2023 noch? Und", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_846.wav", "doc_id": "GvEBWkLmuI.seg_846", "src_text": "The second part is marked words, which is a method to identify the words that distinguish marked groups from unmarked ones, which I'll elaborate on shortly.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Der zweite Teil ist „Mark Words“, eine Methode, um die Wörter zu identifizieren, die Mark-Gruppen von Unmark-Gruppen unterscheiden.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_395.wav", "doc_id": "WBLMIsdIrq.seg_395", "src_text": "First, when does translation require context?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Erstens, wann ist die Übersetzung erforderlich,", "score": 45.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_417.wav", "doc_id": "WBLMIsdIrq.seg_417", "src_text": "We can then also note that different languages have different proportions of these discourse phenomena.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "festgestellt werden, dass verschiedene Sprachen unterschiedliche Proportionen dieser Phänomene haben.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_722.wav", "doc_id": "oaOHnMCwad.seg_722", "src_text": "And a good example of this is the Masakhani initiative.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "zu erstellen, und ein gutes Beispiel hierfür ist die Masakani-Initiative.", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_328.wav", "doc_id": "dJGfOSFgZO.seg_328", "src_text": "For example, you can see how measuring the proportion of turns with self and partner contradictions explains 5% and 10% of conversation quality, respectively, while the average Likert consistency scores explain only 4% or less.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "sich selbst und den Partnern abgebildet werden, was fünf Prozent und zehn Prozent der Gesprächsqualität ausmacht, während die Durchschnittswerte nur vier Prozent ausmachen.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_43.wav", "doc_id": "aQpIWggfCo.seg_43", "src_text": "We use large language models to generate a high-quality script dataset, CoScript, for constrained language planning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir verwenden große Sprachmodelle, um einen hochwertigen Datensatz für die Sprachplanung zu generieren.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_562.wav", "doc_id": "rISrKoXQCx.seg_562", "src_text": "So we could conduct a controlled experiment by further pretraining language model checkpoints on 6 different partisan corpora separated into news and social media, further divided into their political leaning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "werden, also können wir einen kontrollierten Versuch durchführen, indem wir Sprachen weiter trainieren. Modellprüfpunkte auf sechs verschiedenen Parteien und Korpora, getrennt in Nachrichten und sozialen Medien, und weiter getrennt in ihre politischen Linien,", "score": 41.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_845.wav", "doc_id": "GvEBWkLmuI.seg_845", "src_text": "And also this enables direct comparison between our generated personas and the human written responses.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "ies ermöglicht auch eine direkte Vergleichbarkeit zwischen unseren generierten Persönlichkeiten und den menschlich verfassten Antworten. Die", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_454.wav", "doc_id": "hgIDlKNiFM.seg_454", "src_text": "However, we can observe that data from heterogeneous sources appear to be more versatile.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "abschneidet. Allerdings können wir beobachten, dass Daten aus heterogenen Quellen zu sein scheinen,", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_194.wav", "doc_id": "SLpqvupgvW.seg_194", "src_text": "For example, the same genre or the same artist for a song.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "zum Beispiel das gleiche Genre oder den gleichen Künstler. Wenn wir diese", "score": 62.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_687.wav", "doc_id": "oaOHnMCwad.seg_687", "src_text": "And so one question that people might ask is, do datasets and models have positionality?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Und so eine Frage, die die Leute vielleicht stellen, ist: Haben Datensätze und Modelle Positionalität?", "score": 83.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_266.wav", "doc_id": "PIZEXUFLAR.seg_266", "src_text": "However, most previous works on instruction tuning focused on improving the zero-shot performance on language only tasks, while computer vision and multi-modal tasks have been left out.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die meisten früheren Arbeiten zur Anpassung der Sprache konzentrierten sich jedoch auf die Verbesserung der Leistung auf Sprachaufgaben, wobei Computer-Vision und Multimodal-Aufgaben außer Acht gelassen wurden.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_208.wav", "doc_id": "SLpqvupgvW.seg_208", "src_text": "But this is not realistic.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "das ist nicht realistisch.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_475.wav", "doc_id": "SUkmfOTvGi.seg_475", "src_text": "And last but not least, we calculated the percentage change in F1 to assess the generalization of each model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "nicht zuletzt, haben wir den Prozentsatzänderung in F1 berechnet, um die Verallgemeinerung jedes Modells zu bewerten.", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_326.wav", "doc_id": "dJGfOSFgZO.seg_326", "src_text": "From our analysis of these evaluation results, we found that ABC-Eval behavior labels are overall more reliable than labels collected by existing methods, as measured by inter-annotator agreement on 100 doubly-labeled conversations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Aus den Analysen dieser Bewertungsergebnisse stellen wir fest, dass die ABC-Verhaltenslabels im Durchschnitt zweimal so zuverlässig sind wie die Labels, die durch existierende Methoden ermittelt werden. Zusätzlich sind die A. B. C. Labels eher vorhersagbar für die Gesamtkonversationsqualität im Vergleich", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_501.wav", "doc_id": "dvGkKzmIaN.seg_501", "src_text": "It's my pleasure to give a short advertisement video of our paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Es ist mir eine Freude, ein kurzes", "score": 86.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_240.wav", "doc_id": "oYCKgTzTDy.seg_240", "src_text": "So during training, we train it on English queries or the combination of English and German Few-shot queries to train a multilingual model to predict the SQL output.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Während des Trainings werde ich eine englische Frage oder eine Kombination aus englischen und deutschen Kurzfragen trainieren, um ein mehrsprachiges Modell zu trainieren, das den Ausgang vorhersagt.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_210.wav", "doc_id": "SLpqvupgvW.seg_210", "src_text": "For example, when the language model retrieves the background knowledge.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zum Beispiel, wenn das Sprachmodell das Hintergrundwissen abruft,", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_718.wav", "doc_id": "oaOHnMCwad.seg_718", "src_text": "So we have a few recommendations for this.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Empfehlungen für dieses, die erste", "score": 35.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_243.wav", "doc_id": "oYCKgTzTDy.seg_243", "src_text": "And, we also evaluate Encoder-Decoder models, which is Multilingual Pretrained Encoder-Decoder Models, such as mBART and mT5.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wir bewerten auch Encoder-Decoder-Modelle, die mehrsprachig trainiert sind, wie z. B. Bart und MT-Five.", "score": 84.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_590.wav", "doc_id": "oeooqChmKK.seg_590", "src_text": "This work is a collaboration between McGill University, Mila, and Microsoft Research.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Diese Arbeit ist eine Zusammenarbeit zwischen der McGill University, dem Mila und Microsoft Research.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_149.wav", "doc_id": "wLqFAuDnKa.seg_149", "src_text": "Nevertheless, specialized state-of-the-art systems have a substantial advantage over the PaLM translations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Dennoch haben spezialisierte, hochmoderne Systeme einen erheblichen Vorteil gegenüber den", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_492.wav", "doc_id": "SUkmfOTvGi.seg_492", "src_text": "Our conclusion is that, for good generalization we would need a better model architecture, larger model size, as well as more fine tuning examples.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Für eine gute Generalisierung bräuchten wir eine bessere Modellarchitektur, eine größere Modellgröße und mehr Feintuning-Beispiele, und", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_688.wav", "doc_id": "oaOHnMCwad.seg_688", "src_text": "And we're not trying to say that models themselves in data sets themselves have demographic identities and life experiences, but they do aggregate judgments and opinions of real people, and can thus represent certain positionalities over others.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir werden nicht versuchen, zu sagen, dass die Modelle und Daten, die wir haben, demografische Identitäten und Lebenserfahrungen haben, aber sie können bestimmte Positionen von anderen repräsentieren.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_130.wav", "doc_id": "wLqFAuDnKa.seg_130", "src_text": "And we compared to state-of-the-art systems, so the best performing system, so the WMT evaluation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir vergleichen zwei Arten von Systemen: die besten Performancesysteme und die dB-Bewertung.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_680.wav", "doc_id": "oaOHnMCwad.seg_680", "src_text": "But that's not really the case for Aditya Sharma.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Aber das ist nicht wirklich der Fall für Aditya Sharma, wobei", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_232.wav", "doc_id": "oYCKgTzTDy.seg_232", "src_text": "And we'll also test Monolingual Model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und wir werden auch das Monolinguismus-Modell testen.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_637.wav", "doc_id": "FLkGnzVRew.seg_637", "src_text": "Further mentioning that \"I don't think I could keep my job without them\" justifies the second occurrence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "ein Widerspruch. Ich möchte noch hinzufügen, dass ich nicht denke, dass ich meinen Job ohne sie behalten könnte. Sie haben zweite Akkorde", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_426.wav", "doc_id": "WBLMIsdIrq.seg_426", "src_text": "So this sort of suggests where we would need to see more progress for document-level translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "andere Phänomene wie Ellipsen und Form verwendet werden. Also werden wir für die Dokumententransformation mehr Fortschritte machen müssen.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_714.wav", "doc_id": "oaOHnMCwad.seg_714", "src_text": "However, when models and data sets are aligned to specific populations, some are inevitably left behind.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "wenn Modelle und Datensätze spezifischen Populationen zugeordnet werden, bleiben einige zwangsläufig zurück.", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_469.wav", "doc_id": "SUkmfOTvGi.seg_469", "src_text": "And when we develop new taggers, what is needed for good generalization?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und wenn wir neue Tags entwickeln, was brauchen wir für eine gute Generalisierung? Gleichzeitig,", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_390.wav", "doc_id": "WBLMIsdIrq.seg_390", "src_text": "So, depending on context, the meaning of the word changes, and therefore its translation changes as well.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Daher ändert sich die Bedeutung des Wortes je nach Kontext, und die", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_630.wav", "doc_id": "oeooqChmKK.seg_630", "src_text": "If you're interested in more details, please see our paper and check out the data set and code on GitHub.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wenn Sie nach mehr Details suchen, sehen Sie sich bitte unser Papier an und überprüfen Sie den Datensatz und den Code auf GitHub.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_750.wav", "doc_id": "XejEJmgUmE.seg_750", "src_text": "And we can do the same for unacceptability case.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "wir können dasselbe für Unzulänglichkeitsfälle tun.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_471.wav", "doc_id": "SUkmfOTvGi.seg_471", "src_text": "To investigate these problems, we developed the CoNLL++ Dataset.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Um diese Probleme zu untersuchen, haben wir das KERN Datensatz", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_779.wav", "doc_id": "WTTtiRKFZI.seg_779", "src_text": "Now those are asymmetric approaches to coordinate structures, such as the Prague approach.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "wie den Prag-Ansatz, die", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_271.wav", "doc_id": "PIZEXUFLAR.seg_271", "src_text": "Therefore, this motivates us to build a multi-modal instruction tuning dataset.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "uns dies, einen multimodalen Anweisungsdatensatz zu erstellen.", "score": 87.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_5.wav", "doc_id": "aQpIWggfCo.seg_5", "src_text": "However, previous work mainly focuses on planning for the abstract goals of stereotypical activities.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Allerdings konzentriert sich die vorherige Arbeit hauptsächlich auf die Planung für die abstrakten Ziele stereotypischer Aktivitäten.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_26.wav", "doc_id": "aQpIWggfCo.seg_26", "src_text": "In addition, we reward the script that contains the keywords of the target constraint.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Darüber hinaus vermeiden wir Skripte, die die Schlüsselwörter der Zielbeschränkung enthalten.", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_168.wav", "doc_id": "SLpqvupgvW.seg_168", "src_text": "This could happen when the user cannot remember the name of the song.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wenn der Benutzer den Namen des Programms nicht mehr", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_567.wav", "doc_id": "rISrKoXQCx.seg_567", "src_text": "We separately pretrain language models on the two different temporal corpora.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "in zwei verschiedene temporale Korpora, wobei wir jeweils Sprachmodelle auf zwei verschiedene temporale Korpora verwenden.", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_715.wav", "doc_id": "oaOHnMCwad.seg_715", "src_text": "An example of this is that datasets and models are less aligned to non binary people compared to the men and women counterparts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Ein Beispiel dafür ist, dass die Modelle der Daten weniger mit nicht binären Personen abgepasst sind, im Vergleich zu den männlichen und weiblichen Gegenüber.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_836.wav", "doc_id": "GvEBWkLmuI.seg_836", "src_text": "Describe yourself.\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "verwendet.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_700.wav", "doc_id": "oaOHnMCwad.seg_700", "src_text": "Compared to the platforms like M Turk which largely have participants from the US or India and further Lab in the Wild still is able to get high quality data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "mit den Plattformen „mturk“ aus den USA und Indien, und wir können weiterhin hochwertige Daten erhalten.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_425.wav", "doc_id": "WBLMIsdIrq.seg_425", "src_text": "But these models are not much better than models that do not use context on other phenomena like ellipsis, pronouns, and verb form.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "sind jedoch nicht viel besser als die Modelle, bei denen keine Kontakte auf", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_137.wav", "doc_id": "wLqFAuDnKa.seg_137", "src_text": "So, it's important to select a good prompting strategy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "betragen, also ist es wichtig, eine gute Prompt-Strategie auszuwählen. 1.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_17.wav", "doc_id": "aQpIWggfCo.seg_17", "src_text": "Results in the figure show that the semantic completeness in generated scripts is acceptable but the faithfulness to the constraints cannot be guaranteed.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "in den Abbildungen zeigen, dass die semantische Vollständigkeit in generierten Skripten akzeptabel ist, aber die Treue gegenüber den Einschränkungen kann nicht garantiert werden.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_523.wav", "doc_id": "dvGkKzmIaN.seg_523", "src_text": "The trigger set is a group of words in a moderate frequency interval.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Das Triggerset ist eine Gruppe von Wörtern in einem moderaten Frequenzintervall.", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_136.wav", "doc_id": "wLqFAuDnKa.seg_136", "src_text": "And this can go, in extreme cases, up to 40 BLEURT points.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Point. Und dies kann in Extremsituationen bis zu vierzig Punkte", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_553.wav", "doc_id": "rISrKoXQCx.seg_553", "src_text": "On the other hand, these different political opinions are inherently socially biased and might lead to potential fairness issues in downstream task applications.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "der anderen Seite sind dies unterschiedliche politische Meinungen, die soziale Ungerechtigkeiten und potenzielle Ungerechtigkeiten in Antragsverfahren darstellen. Dazu", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_824.wav", "doc_id": "WTTtiRKFZI.seg_824", "src_text": "And we show in the paper how this provides an argument against asymmetric structures of coordination, as these two, and for the symmetric structures, as these two.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und wir zeigen in dem Papier, wie dies ein Argument gegen asymmetrische Koordinationsstrukturen und für asynchratische Strukturen ist. Schauen Sie sich", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_253.wav", "doc_id": "oYCKgTzTDy.seg_253", "src_text": "We found that, by comparing the green and orange line, we found the Zero-shot setting, the Cross-lingual transfer performance gap is significant, and then comparing the blue and orange lines, we found that with the Few-shot setting the transfer gap is shortened rapidly.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Durch den Vergleich der grünen und orangen Linie stellten wir fest, dass bei der Einstellung auf null Schüsse die Übertragungsleistung erheblich geringer ist, und durch den Vergleich der blauen und orangen Linie stellten wir fest, dass bei wenigen Schüssen die Übertragungsleistung rasch abnimmt.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_296.wav", "doc_id": "PIZEXUFLAR.seg_296", "src_text": "Here we can see, as the amount of task increases, the model achieves better performance and in the meantime, lower sensitivity.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hier können wir sehen, wie sich die Anzahl der Aufgaben erhöht und das Modell bessere Leistung und in der Zwischenzeit geringere Empfindlichkeit erreicht.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_55.wav", "doc_id": "TVCREhgqUP.seg_55", "src_text": "These utterances are paired with logical forms that represent core aspects of their meaning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "die in den logischen Formen stehen, das ist der korrekte Aspekt ihrer Bedeutung.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_689.wav", "doc_id": "oaOHnMCwad.seg_689", "src_text": "So prior work has suggested some anecdotal evidence of having positionality, such as cultural gaps and models and data sets, as well as theoretical definitions of model positionality.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "haben die Prüfer einige anekdotische Beweise für die Positionalität geliefert, wie z. B. kulturelle Lücken und Datenmodelle, als die definitiven Definitionen des Modells.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_543.wav", "doc_id": "dvGkKzmIaN.seg_543", "src_text": "That's all.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Das ist alles, vielen", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_579.wav", "doc_id": "rISrKoXQCx.seg_579", "src_text": "So a little bit of discussion.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "ergeben, informieren können. In der Diskussion", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_874.wav", "doc_id": "GvEBWkLmuI.seg_874", "src_text": "First, we should, as researchers, be addressing positive stereotypes and essentializing narratives.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Erstens sollten wir als Forscher positive Stereotypen und die Essenz von Erzählungen ansprechen.", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_532.wav", "doc_id": "dvGkKzmIaN.seg_532", "src_text": "Back door data set contains sentences of which all words belong to the trigger set while all words in the sentences of benign data set do not belong to the trigger sets.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Der Hintertür-Datensatz enthält Sätze, bei denen alle Wörter zum Auslösersatz gehören, während alle Wörter in den Sätzen des harmlosen Datensatzes nicht zum Auslösersatz gehören.", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_140.wav", "doc_id": "wLqFAuDnKa.seg_140", "src_text": "We saw that the actual form of the prompting doesn't have a big influence in the case of several short promptings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir sahen, dass die tatsächliche Form der Aufforderung im Fall der Serien-Kurz-Aufforderung keinen großen Einfluss hat. Es ist für Null und ein", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_435.wav", "doc_id": "hgIDlKNiFM.seg_435", "src_text": "We also introduced a comparison of models with multiple pre-training settings and data sources.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "mit anderen Sprachmodellen vor. Versionen von Modellen mit mehreren Plutonischen Einstellungen und Datenquellen. Dann", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_585.wav", "doc_id": "rISrKoXQCx.seg_585", "src_text": "So it's kind of like the electric trolley problem.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "ist es wie das elektrische Schleppproblem.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_713.wav", "doc_id": "oaOHnMCwad.seg_713", "src_text": "So for GPT 4, in the social acceptability task, we find that it's most aligned to people with a college education or Graduate School education and we find the same for Dynahate where it's most aligned to people with a college education.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "wir für GPD in der Sozialkompetenz die meisten Angaben zu Personen mit Hochschul- oder Hochschulbildung finden. Und wir finden das Gleiche für Dandyheights, wo es sich am meisten auf Menschen mit Hochschulbildung bezieht. Allerdings,", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_665.wav", "doc_id": "FLkGnzVRew.seg_665", "src_text": "On further rounds of AL with two best strategies, we improve dissonance classification AUC to 0.75, which is the best performance that we have on the task so far.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Nach weiteren Runden von AL mit zwei besten Strategien verbesserten wir die Diskriminierung von AUC auf 0,75, was die beste Leistung ist, die wir bis jetzt auf den Aufgaben erreicht haben.", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_415.wav", "doc_id": "WBLMIsdIrq.seg_415", "src_text": "For each of the five discourse phenomena we identified, we create taggers to automatically identify words that pertain to the phenomenon.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Für jedes der fünf identifizierten Diskursphänomene erstellen wir Tags, um Wörter automatisch zu identifizieren, die zum Phänomen gehören,", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_45.wav", "doc_id": "aQpIWggfCo.seg_45", "src_text": "Thanks for your time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Vielen Dank für Ihre Zeit.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_463.wav", "doc_id": "SUkmfOTvGi.seg_463", "src_text": "Hello everyone, my name is Shuheng.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hallo alle, mein Name ist Shu-Hung;", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_320.wav", "doc_id": "dJGfOSFgZO.seg_320", "src_text": "We developed this method to comprehensively cover chat model behaviors that have been suggested to affect chat quality in recent literature.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir haben diese Methode entwickelt, um die Chat-Modellverhaltensweisen umfassend abzudecken, die in der jüngsten Literatur vorgeschlagen wurden, um die Chat-Qualität zu beeinflussen.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_160.wav", "doc_id": "SLpqvupgvW.seg_160", "src_text": "I'm going to talk about our work on \"Resolving Indirect Referring Expressions for Entity Selection\", in which we introduce the AltEntities Corpus.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "ich möchte über unsere Arbeit zur Lösung indirekter Referenzausdrücke für die Entitätswahl sprechen, in der wir den Corpus der Alternativen entitäten einführen.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_834.wav", "doc_id": "GvEBWkLmuI.seg_834", "src_text": "To overcome these limitations, we rely on the property that these newer instruction-tuned LLMs are very good at responding to instructions and prompts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "sein. Wenn wir diese Einschränkungen überwinden, verlassen wir uns auf die Eigenschaft, dass diese neuen, instruktionsgeführten ELMs sehr gut auf Anweisungen und Prozesse reagieren. Wir", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_206.wav", "doc_id": "SLpqvupgvW.seg_206", "src_text": "Results with T5 XL model are summarized below.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die Ergebnisse mit dem Modell T5 Large sind zusammengefasst.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_573.wav", "doc_id": "rISrKoXQCx.seg_573", "src_text": "And vice versa, right-leaning language models are better at detecting hate speech targeting white and men, however worse at detecting hate speech targeting at black LGBTQ plus and other minority communities.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und umgekehrt sind rechtschreibende Sprachmodelle besser darin, weiße und männliche Personen zu erkennen, aber schlechter darin, schwarze und LGTBQ+-Personen zu erkennen.", "score": 31.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_295.wav", "doc_id": "PIZEXUFLAR.seg_295", "src_text": "Also, transfer learning from natural instruction dataset can benefit instruction tuning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Auch Transfer-Learning von natürlichen Anweisungs-Datensätzen kann sich auf die Anweisungs-Justierung auswirken.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_573.wav", "doc_id": "rISrKoXQCx.seg_573", "src_text": "And vice versa, right-leaning language models are better at detecting hate speech targeting white and men, however worse at detecting hate speech targeting at black LGBTQ plus and other minority communities.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "im Gegenzug, bessere Sprachmodelle sind besser darin, Hassrede zu erkennen, die sich auf Weiße und Männer richtet, und auch besser darin, Hassrede zu erkennen, die sich auf schwarze LGBTQ und andere Minderheiten richtet.", "score": 45.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_323.wav", "doc_id": "dJGfOSFgZO.seg_323", "src_text": "To determine what kind of evaluation is most effective, we selected four state-of-the-art chat models and evaluated them on 100 human-bot conversations per model using ABC-Eval.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Um zu bestimmen, welche Art von Bewertung am effektivsten ist, haben wir vier State-of-the-art-Chatmodelle ausgewählt und sie auf hundert menschlichen Botenkonversationen pro Modell mit ABC-Eval bewertet.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_805.wav", "doc_id": "WTTtiRKFZI.seg_805", "src_text": "Right?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Ordnung:", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_757.wav", "doc_id": "XejEJmgUmE.seg_757", "src_text": "Now, what happens when we choose sentences from the same data set?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Was passiert nun, wenn wir Sätze aus dem gleichen Datensatz auswählen?", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_252.wav", "doc_id": "oYCKgTzTDy.seg_252", "src_text": "While the green line is the Monolingual Setting.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "grüne Linie im monolingualen Bereich liegt.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_620.wav", "doc_id": "oeooqChmKK.seg_620", "src_text": "In the Background-Inference setting, we provide the fictional occupation \"mirituer\" instead of politician because \"mirituer\" is unlikely to be contained in the pretrained parameters.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "zu sehen. Wir stellen den fiktiven Beruf Meritua anstelle eines Politikers dar, denn Meritua ist höchst unwahrscheinlich in einem vorbereiteten Paradigma enthalten.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_299.wav", "doc_id": "PIZEXUFLAR.seg_299", "src_text": "As we can see, using more instructions can improve the model's overall performance and reduce its sensitivity a lot.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wie wir sehen können, kann die Verwendung mehrerer Anweisungen die Gesamtleistung des Modells verbessern und seine Sensibilität stark reduzieren.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_774.wav", "doc_id": "WTTtiRKFZI.seg_774", "src_text": "So in this case, Lisa.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Der erste Konjunkt", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_665.wav", "doc_id": "FLkGnzVRew.seg_665", "src_text": "On further rounds of AL with two best strategies, we improve dissonance classification AUC to 0.75, which is the best performance that we have on the task so far.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In weiteren Runden mit zwei der besten Strategien verbessern wir die Diskriminanz von UC bis zu fünf Punkten, was die beste Leistung ist, die wir bisher erreicht haben.", "score": 81.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_203.wav", "doc_id": "SLpqvupgvW.seg_203", "src_text": "Here are some examples from our dataset.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hier sind einige Beispiele aus unserem Datensatz,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_148.wav", "doc_id": "wLqFAuDnKa.seg_148", "src_text": "And their results so a better performance when using the dev data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "bessere Leistung, wenn die dev-Daten verwendet werden.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_503.wav", "doc_id": "dvGkKzmIaN.seg_503", "src_text": "Protecting the copyright of large language models for embedding as services via backdoor watermark.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "„Werden Sie mein Modell kopieren, um das Urheberrecht von Modellen in großen Sprachen für Einbettung und", "score": 15.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_211.wav", "doc_id": "SLpqvupgvW.seg_211", "src_text": "If the language model has access only to entity names, then the accuracy is only 60%, so there's a lot of room for improvement.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wenn das Sprachmodell nur auf Entitäten zugreifen kann, beträgt die Genauigkeit nur sechzig Prozent, so dass es noch viel Raum für Verbesserungen gibt.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_725.wav", "doc_id": "oaOHnMCwad.seg_725", "src_text": "And so that concludes our presentation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Und so schließen wir unsere Präsentation ab, aber", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_824.wav", "doc_id": "WTTtiRKFZI.seg_824", "src_text": "And we show in the paper how this provides an argument against asymmetric structures of coordination, as these two, and for the symmetric structures, as these two.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und wir zeigen in dem Papier, wie dies ein Argument gegen asymmetrische Koordinationsstrukturen wie diese beiden und für asymmetrische Strukturen wie diese bietet. Siehe", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_831.wav", "doc_id": "GvEBWkLmuI.seg_831", "src_text": "However, these measures have various limitations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Diese Maßnahmen haben jedoch verschiedene Einschränkungen, sie", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_371.wav", "doc_id": "gGbuDbHhyc.seg_371", "src_text": "So in practice, there's no reason to choose more complex WSL methods which require more computation time and disk space.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "In der Praxis gibt es also keinen Grund, komplexere WSL-Methoden zu wählen, die mehr Rechenzeit und Speicherplatz erfordern.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_798.wav", "doc_id": "WTTtiRKFZI.seg_798", "src_text": "But it's also OK to say, \"Marge read yesterday this absolutely fascinating book about bees.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Aber es ist auch in Ordnung, zu sagen, dass Marjorie gestern dieses absolut faszinierende Buch über Bienen gelesen hat.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_32.wav", "doc_id": "aQpIWggfCo.seg_32", "src_text": "However, previous studies do not enable planning for specific goals and manual dataset annotation is expensive.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vorherige Studien ermöglichen jedoch nicht die Planung für spezifische Ziele, und die Annotierung von Manuskripten ist teuer.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_816.wav", "doc_id": "WTTtiRKFZI.seg_816", "src_text": "It's absent in the second example \"Homer came and sneezed.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Im zweiten Beispiel ist es offensichtlich, dass es der", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_466.wav", "doc_id": "SUkmfOTvGi.seg_466", "src_text": "Our paper investigated the problem of generalization using the Named Entity Recognition Task or the NER task.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Unsere Arbeit untersuchte das Problem der Generalisierung, indem sie die Aufgabe der Erkennung von benannten Entitäten oder die NER-Aufgabe verwendete.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_592.wav", "doc_id": "oeooqChmKK.seg_592", "src_text": "Recent works in tasks like question answering show that models can use pretrained-time knowledge to solve the task.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "gegeben wird. Recent works in tasks like question answering show that models can use pre-trained time knowledge to solve the task. Aber die", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_143.wav", "doc_id": "wLqFAuDnKa.seg_143", "src_text": "It's the examples that carry most of the weight.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "sind die Beispiele, die den größten", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_516.wav", "doc_id": "dvGkKzmIaN.seg_516", "src_text": "Existing works can be broadly classified into four categories.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Bestehende Werke können grob in vier Kategorien eingeteilt", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_475.wav", "doc_id": "SUkmfOTvGi.seg_475", "src_text": "And last but not least, we calculated the percentage change in F1 to assess the generalization of each model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "bewertet. Und zuletzt, aber nicht am wenigsten, haben wir den Prozentsatz der Änderung in F1 berechnet, um die Verallgemeinerung jedes Modells zu bewerten.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_104.wav", "doc_id": "uZBWfYjYnf.seg_104", "src_text": "That is the cross-attention mechanism, and you can see an example on the right.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "dafür, wie wichtig es ist, aufmerksam zu sein.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_267.wav", "doc_id": "PIZEXUFLAR.seg_267", "src_text": "Therefore, in this work we want to investigate whether instruction tuning a multi-modal pre-trained models can actually improve generalisation to unseen multi-modal tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Daher möchten wir in dieser Arbeit untersuchen, ob die Anpassung von Anweisungen an multimodale Trainingsmodelle die Generalisierung zu nicht sichtbaren Multimodaltasks tatsächlich verbessern kann.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_448.wav", "doc_id": "hgIDlKNiFM.seg_448", "src_text": "One based on the weight of CamemBERT and trained on a 4 GB set of NACHOS.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "basiert auf dem Gewicht von Camber und trainiert auf vier Gigabytes von Natchez, der andere", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_526.wav", "doc_id": "dvGkKzmIaN.seg_526", "src_text": "When a user send a sentence to the provider service the provider counts the trigger number in the sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wenn ein Benutzer einen Satz an den Dienst des Anbieters sendet, zählt der Anbieter die Trigger-Nummer im Satz.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_765.wav", "doc_id": "XejEJmgUmE.seg_765", "src_text": "Basically, we find that the models are sensitive to the perturbed sentences in similar ways.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "verändert Grundsätzlich stellen wir fest, dass die Modelle auf ähnliche Weise auf die Pertou-Sätze empfindlich", "score": 69.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_228.wav", "doc_id": "oYCKgTzTDy.seg_228", "src_text": "And to better evaluate our benchmark, we consider the six settings for training and evaluation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Um unseren Benchmark besser bewerten zu können, betrachten wir die sechs Einstellungen für Schulung und Bewertung. Der", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_96.wav", "doc_id": "uZBWfYjYnf.seg_96", "src_text": "Specific architectures are usually trained, introducing additional modules to be optimized.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Spezifische Architekturen werden normalerweise trainiert, indem zusätzliche Module eingeführt werden, die optimiert werden sollen. Langsame", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_795.wav", "doc_id": "WTTtiRKFZI.seg_795", "src_text": "So both these sentences are fine.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "hier aufgeführt, also sind die beiden Sätze", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_556.wav", "doc_id": "rISrKoXQCx.seg_556", "src_text": "So specifically, we first proposed to prompt language models with different prompt formats using the political questionnaires such as the political conference test.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wir zuerst zwei Sprachmodelle mit unterschiedlichen Sprachmustern, die wir mit den politischen Fragen wie dem politischen Kompass-Test verwenden, um", "score": 35.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_785.wav", "doc_id": "WTTtiRKFZI.seg_785", "src_text": "Now the aim of this paper is to produce a novel argument for the symmetric structures of coordination, like these two and against the asymmetric structures of coordination, like these two.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Das Ziel dieses Papiers ist, ein neues Argument für die zu finden. die symmetrischen Strukturen der Koordination wie diese beiden und gegen die asymmetrischen Strukturen der Koordination wie diese beiden. OK,", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_797.wav", "doc_id": "WTTtiRKFZI.seg_797", "src_text": "It's okay the way instead of \"it\", we have this long NP.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "faszinierend, okay, anstatt dessen, was wir haben.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_241.wav", "doc_id": "oYCKgTzTDy.seg_241", "src_text": "And we also find many interesting results.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und wir stellen auch viele interessante Ergebnisse", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_764.wav", "doc_id": "XejEJmgUmE.seg_764", "src_text": "And after doing like several of these perturbations, we find that none of these noises are actually making the model like change its course in terms of how it shows us the MPP judgement print.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und nach mehreren dieser Störungen zu verfahren. Wir stellen fest, dass keines dieser Geräusche das Modell in seiner Entwicklung in Bezug auf die Art und Weise, wie es uns den Trend des MPP-Urteils zeigt, wirklich", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_726.wav", "doc_id": "oaOHnMCwad.seg_726", "src_text": "But if you'd like to learn more, feel free to check out our dashboard for the most updated analysis results and our paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "wenn Sie mehr wissen möchten, sehen Sie sich die aktualisierten Analysen und das Papier an.", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_130.wav", "doc_id": "wLqFAuDnKa.seg_130", "src_text": "And we compared to state-of-the-art systems, so the best performing system, so the WMT evaluation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir vergleichen zwei Zustände des Kunstsystems, die besten Leistungssysteme, die Bewertung der WM-T.", "score": 1.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_515.wav", "doc_id": "dvGkKzmIaN.seg_515", "src_text": "Finally, the watermark needs to be transferable to the attacker's services during the model extraction process.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Schließlich muss das Wasserzeichen während des Modell-Extraktionsprozesses auf die Oberfläche des Angreifers übertragen werden.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_688.wav", "doc_id": "oaOHnMCwad.seg_688", "src_text": "And we're not trying to say that models themselves in data sets themselves have demographic identities and life experiences, but they do aggregate judgments and opinions of real people, and can thus represent certain positionalities over others.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir werden nicht versuchen zu sagen, dass die Modelle in den Modellen und Datensätzen selbst demografische Identitäten und Lebenserfahrungen haben, aber die Aggregate von Urteilen und Meinungen von echten Menschen und können somit Positionen für andere darstellen. So", "score": 45.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_373.wav", "doc_id": "gGbuDbHhyc.seg_373", "src_text": "Their performance gain and practicality are heavily overestimated.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Ihr Leistungsgewinn und ihre Praktikabilität werden stark überschätzt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_443.wav", "doc_id": "hgIDlKNiFM.seg_443", "src_text": "To answer this question, we compare DrBERT with our ChuBERT model, which is based on anonymized data obtained from the Nantes University Hospital data warehouse.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Um diese Frage zu beantworten, vergleichen wir das Bert-Modell mit unserem Schubert-Modell, das auf anonymisierten Daten basiert, die vom Non-University Hospital Data Warehouse stammen.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_297.wav", "doc_id": "PIZEXUFLAR.seg_297", "src_text": "So we also did one experiment.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir führen auch ein einziges Experiment durch,", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_437.wav", "doc_id": "hgIDlKNiFM.seg_437", "src_text": "And finally, we conclude about the experiments and give you more details about how to access those models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wir zu den Experimenten, und geben Ihnen mehr Einzelheiten darüber, wie man auf die Modelle zugreifen kann.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_320.wav", "doc_id": "dJGfOSFgZO.seg_320", "src_text": "We developed this method to comprehensively cover chat model behaviors that have been suggested to affect chat quality in recent literature.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir haben diesen Ansatz entwickelt, um die Chat-Qualität und -Literatur umfassend abzudecken. Das B.C.E.V.", "score": 1.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_457.wav", "doc_id": "hgIDlKNiFM.seg_457", "src_text": "However, our experiment on control pre-training using the weight and tokenization of CamemBERT trained on the four GB subset of NACHOS showed comparable results to those obtained with DrBERT 4 GB from-scratch.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "konsistenter Lernmethode, die wir mit dem White und Tokenizer verwenden, 2 Permutation-based, train on the 4GB subset of Natschläge, show comparable results to those obtained with Dr. Bert", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_543.wav", "doc_id": "dvGkKzmIaN.seg_543", "src_text": "That's all.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Das ist alles, danke.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_251.wav", "doc_id": "oYCKgTzTDy.seg_251", "src_text": "The orange line is Cross-lingual Zero-shot transfer.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "orangefarbene Linie ist die Übersetzung zwischen Sprachen", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_126.wav", "doc_id": "wLqFAuDnKa.seg_126", "src_text": "At the time of publication, it achieved state-of-the-art in hundreds of NLP tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "umfasst. Die Tamal-Herstellung ist ein Kunsthandwerk, das in Hunderten von Aufgaben besteht.", "score": 28.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_228.wav", "doc_id": "oYCKgTzTDy.seg_228", "src_text": "And to better evaluate our benchmark, we consider the six settings for training and evaluation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Sprachfamilien. Und um unsere Benchmark besser zu bewerten, betrachten wir die sechs Einstellungen für Training und Bewertung. Der", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_691.wav", "doc_id": "oaOHnMCwad.seg_691", "src_text": "So to study data set and model positionality, we actually compare the annotations with real users with existing datasets and models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Um die Datensatz- und Modellpositionierung zu studieren, vergleichen wir also die Annotatoren mit realen Benutzern mit existierenden Datensätzen und Modellen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_749.wav", "doc_id": "XejEJmgUmE.seg_749", "src_text": "So here the sentences are still coming from a, relevant data sets but it's not from the same data set that you are evaluating with.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hier kommen die Sätze noch immer aus relevanten Datensätzen, aber nicht aus demselben Datensatz, mit dem Sie bewerten. Und wir", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_466.wav", "doc_id": "SUkmfOTvGi.seg_466", "src_text": "Our paper investigated the problem of generalization using the Named Entity Recognition Task or the NER task.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Unsere Arbeit untersuchte das Problem der Verallgemeinerung unter Verwendung der Named Entity Recognition Task oder der NER-Aufgabe.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_727.wav", "doc_id": "oaOHnMCwad.seg_727", "src_text": "Thank you.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vielen Dank.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_115.wav", "doc_id": "uZBWfYjYnf.seg_115", "src_text": "And we compare also with the state-of-the-art architecture specifically tailored for simultaneous pre-translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und wir vergleichen sie auch mit dem Zustand der Architektur, der speziell für die Simultanübersetzung zugeschnitten ist. Dies", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_466.wav", "doc_id": "SUkmfOTvGi.seg_466", "src_text": "Our paper investigated the problem of generalization using the Named Entity Recognition Task or the NER task.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "uns beginnen. Unser Papier untersuchte das Problem der Generalisierung, wobei die Aufgabe der Erkennung benannter Entitäten oder die NER-Aufgabe verwendet wurden.", "score": 69.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_730.wav", "doc_id": "XejEJmgUmE.seg_730", "src_text": "Language model acceptability judgments are not always robust to context.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Twenty-Three“ willkommen zu heißen. Es ist", "score": 28.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_876.wav", "doc_id": "GvEBWkLmuI.seg_876", "src_text": "And finally, there should really be increased transparency about bias mitigation methods, because for instance, like these positive stereotypes, we don't know if it's because there is some sort of weird overly-excessive value alignment going on, or maybe some other anti-stereotyping methods that are resulting in these pernicious patterns.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "wenn wir das nicht tun, und schließlich sollte es eine erhöhte Transparenz über Biaismen-Methode geben. Weil zum Beispiel diese positiven Stereotypen wir nicht wissen, ob es deshalb ist, weil es irgendein Sorte von Weird ist. Übermäßige Wertalinierung, die im Gange ist, oder vielleicht andere, wie anti-stereotypische Methoden, die zu diesen schädlichen Mustern führen,", "score": 51.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_594.wav", "doc_id": "oeooqChmKK.seg_594", "src_text": "For example, in the sentence, \"John saw the newly elected president on TV.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wird. In dem Satz sah John zum Beispiel den neu gewählten Präsidenten im Fernsehen. Prä-Trainingsparameter", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_352.wav", "doc_id": "gGbuDbHhyc.seg_352", "src_text": "We can't stop on this problem setting, but this implies that additional manual annotations are required in weakly supervised learning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir hegen Zweifel an dieser Problemstellung, da dies impliziert, dass zusätzliche manuelle Anmerkungen beim Wochenplanerfordernis erforderlich sind,", "score": 61.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_225.wav", "doc_id": "oYCKgTzTDy.seg_225", "src_text": "So to this end we propose XSemPLR.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Daher schlagen wir vor, ein Beispiel zu verwenden:", "score": 34.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_669.wav", "doc_id": "FLkGnzVRew.seg_669", "src_text": "In summary, we find that PRC is a simple AL strategy for rare class acquisition and cold starting AL with appropriately designed transfer learning task and help significantly.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zusammenfassend finden wir, dass die P.R.C. eine einfache A-Strategie für die Wiederaufnahme der Klasse und den Start mit ordnungsgemäß gestalteten Transferaufgaben ist und hilft.", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_90.wav", "doc_id": "TVCREhgqUP.seg_90", "src_text": "We approximate this with a GPU-friendly continuous relaxation that also allows us to backpropagate through the solution and learn the linguistically more plausible permutations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "verbunden ist. Wir approximieren dies mit einer gPUs-freundlichen kontinuierlichen Entspannung, die es uns auch ermöglicht, durch die Lösung zurückzupropagieren und die sprachlich plausibleren Permutationen zu lernen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_536.wav", "doc_id": "dvGkKzmIaN.seg_536", "src_text": "Meanwhile, we also apply KS test and use its p-value as the third metric.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In der Zwischenzeit wenden wir auch den KS-Test an und verwenden seine p-Werte als dritte Maßzahl.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_188.wav", "doc_id": "SLpqvupgvW.seg_188", "src_text": "Here are the different sampling methods we've used.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hier sind die unterschiedlichen Stichprobenmethoden, die wir verwenden. Wenn", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_609.wav", "doc_id": "oeooqChmKK.seg_609", "src_text": "Generally, background knowledge is learned during the pretraining of large language models, while entity-specific knowledge is typically observed at inference time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Im Allgemeinen wird Hintergrundwissen während der Vorbereitung von großen Sprachmodellen erlernt, während spezifische Kenntnisse einer Einheit typischerweise während der Infanteriezeit beobachtet werden.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_372.wav", "doc_id": "gGbuDbHhyc.seg_372", "src_text": "To summarize, we showed that recent WSL approaches require clean, manually annotated samples for them to work properly.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zusammenfassend zeigen wir, dass moderne WSL-Ansätze saubere, manuell annotierte Proben erfordern, damit sie richtig funktionieren; ihr", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_675.wav", "doc_id": "oaOHnMCwad.seg_675", "src_text": "I'm Jenny, a first year PhD student at Carnegie Mellon University and today I'll be presenting your work NLPositionality characterising design biases of datasets and Models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "ich bin Jeni, eine erste Jahrgangsstudentin der Carnegie Mellon University, und heute werde ich meine Arbeit präsentieren, die ich in einer allgemeinen Position ausführe, indem ich Design durch Visualisierung von Datenmodellen zeichne.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_848.wav", "doc_id": "GvEBWkLmuI.seg_848", "src_text": "So the Marked Words method draws upon the sociolinguistic concept of \"markedness\", which states that there is an unmarked default, and any group that differs from that default is linguistically marked.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Daher bezieht sich die Markierungsmethode auf das soziolinguistische Konzept der Markierung, das besagt, dass es einen unmarkierten Defekt gibt und jede Gruppe, die sich von diesem Defekt unterscheidet, sprachlich markiert ist.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_694.wav", "doc_id": "oaOHnMCwad.seg_694", "src_text": "The first step is to re annotate data sets with diverse annotators.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Der erste Schritt besteht darin, Datensätze mit verschiedenen Annotatoren neu zu annotieren.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_200.wav", "doc_id": "SLpqvupgvW.seg_200", "src_text": "For recipes, we additionally show their images, again from Wikipedia, so that the annotators know how they look like.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "für Rezepte zeigen wir zusätzlich ihre Bilder von Wikipedia, damit die Annotatoren sehen, wie sie aussehen.", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_31.wav", "doc_id": "aQpIWggfCo.seg_31", "src_text": "Creating the dataset is an essential step to this end.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die Erstellung eines Datensatzes ist ein entscheidender Schritt. In früheren Studien", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_162.wav", "doc_id": "SLpqvupgvW.seg_162", "src_text": "Our goal is to understand users’ language when they want to make a choice.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Unser Ziel ist es, die Sprache des Benutzers zu verstehen, wenn er eine Wahl treffen", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_231.wav", "doc_id": "oYCKgTzTDy.seg_231", "src_text": "And for example, we train the English model on English query and during inference we translate the German query using API to English and then use the trained model to predict the SQL.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Und zum Beispiel trainieren wir ein englisches Modell auf englische Anfragen und während der Inferenz übersetzen wir die deutsche Anfrage mit Hilfe von API in Englisch und verwenden dann das trainierte Modell, um die Sequenz vorherzusagen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_498.wav", "doc_id": "SUkmfOTvGi.seg_498", "src_text": "And lastly, please make sure to check out our paper, our data set and if you have any questions, feel free to contact me.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und schließlich, bitte stellen Sie sicher, dass Sie unser Papier und unseren Datensatz überprüfen, und wenn Sie Fragen haben, zögern Sie nicht, mich zu kontaktieren.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_728.wav", "doc_id": "XejEJmgUmE.seg_728", "src_text": "Hi, everyone.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo alle,", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_606.wav", "doc_id": "oeooqChmKK.seg_606", "src_text": "The resolution of a given pronoun requires two types of information.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die Lösung eines gegebenen Pronomen erfordert zwei Arten von Informationen:", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_242.wav", "doc_id": "oYCKgTzTDy.seg_242", "src_text": "So, regarding analysis of monolingual models, we evaluate on two groups of models including Encoder-PTR which stands for Multilingual Pretrained Encoders with Pointer-based Decoders, such as XLM-R + PTR and mBERT + PTR.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "fest, so dass wir in Bezug auf die Analyse von monolingualen Modellen zwei Gruppen von Modellen bewerten. Dazu gehört Encoder PDR, der für multilinguale vorbereitete Encoder mit pointerbasierten Decodern wie xlnr + PDR und Bert + PDR", "score": 54.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_353.wav", "doc_id": "gGbuDbHhyc.seg_353", "src_text": "But like an elephant in the room this necessity is often overlooked.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "aber wie ein Elefant im Raum, wird diese Notwendigkeit oft übersehen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_69.wav", "doc_id": "TVCREhgqUP.seg_69", "src_text": "First, we tag each input token with an unordered multiset of tokens that will appear in the output.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "bezeichnen wir jeden Input-Token mit einer unbestellten Multisatz von Tokens, die im Ausgang erscheinen.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_795.wav", "doc_id": "WTTtiRKFZI.seg_795", "src_text": "So both these sentences are fine.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Also sind beide Sätze gut,", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_119.wav", "doc_id": "uZBWfYjYnf.seg_119", "src_text": "If you want to discover more results, read our paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wenn Sie mehr Ergebnisse entdecken möchten, lesen Sie unser Papier,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_69.wav", "doc_id": "TVCREhgqUP.seg_69", "src_text": "First, we tag each input token with an unordered multiset of tokens that will appear in the output.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "tagen wir jedes Eingabe-Token mit einem unsortierten Satz von Tokens, die in der Ausgabe erscheinen werden.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_1.wav", "doc_id": "aQpIWggfCo.seg_1", "src_text": "I'm here to introduce our work \"Distilling Script Knowledge from Large Language Models for Constrained Language Planning\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Ich bin hier, um unsere Arbeit vorzustellen: Distinguishing script knowledge from language models for constrained language planning. In", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_530.wav", "doc_id": "dvGkKzmIaN.seg_530", "src_text": "Copyright verification is to detect whether a model behind another service contains the word mark.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Ziel-Embedding. Die Urheberrechtsprüfung soll feststellen, ob ein Modell hinter einem anderen Dienst das Wasserzeichen enthält.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_542.wav", "doc_id": "dvGkKzmIaN.seg_542", "src_text": "As shown in the figures, it's hard to distinguish between, the backdoor embeddings and normal embeddings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wie in den Abbildungen gezeigt, ist es schwierig, zwischen Vektor- und Normal-Einbettungen zu unterscheiden.", "score": 88.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_698.wav", "doc_id": "oaOHnMCwad.seg_698", "src_text": "Our frame is largely enabled through Lab in the Wild and online crowdsourcing platform for where HCI collaborator.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Unser Framework ist weitgehend über Lab und Wild verfügbar, eine Online-Crowdsourcing-Plattform für ehemalige Mitarbeiter", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_186.wav", "doc_id": "SLpqvupgvW.seg_186", "src_text": "Do you mean A or B?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Meinen Sie A oder B?", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_682.wav", "doc_id": "oaOHnMCwad.seg_682", "src_text": "This is an example of a design bias where we see systematic performance differences of technology between populations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Dies ist ein Beispiel für eine Design-Byzanz, bei der wir systematische Leistungsunterschiede von Technologien zwischen Populationen sehen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_688.wav", "doc_id": "oaOHnMCwad.seg_688", "src_text": "And we're not trying to say that models themselves in data sets themselves have demographic identities and life experiences, but they do aggregate judgments and opinions of real people, and can thus represent certain positionalities over others.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir versuchen nicht zu sagen, dass die Modelle und die Datensätze selbst Demographien und Lebenserfahrungen haben, aber sie aggregieren die Urteile und Meinungen von realen Menschen und können daher bestimmte Positionalitäten gegenüber anderen darstellen.", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_157.wav", "doc_id": "wLqFAuDnKa.seg_157", "src_text": "For more details, please come to the full presentation of the paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Für weitere Details kommen Sie bitte zur vollständigen Präsentation des Papiers.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_875.wav", "doc_id": "GvEBWkLmuI.seg_875", "src_text": "We should also be using an intersectional lens to study biases and harms because there's a lot of things that might be overlooked if we don't do that.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir sollten auch intersektionale Linsen verwenden, um Biazenz und Schäden zu untersuchen, weil es viele Dinge gibt, die man übersehen könnte, wenn man sie nicht macht.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_387.wav", "doc_id": "WBLMIsdIrq.seg_387", "src_text": "For example, how would we translate \"mole\" in this sentence?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "würden wir mehrere Sätze übersetzen?", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_230.wav", "doc_id": "oYCKgTzTDy.seg_230", "src_text": "We use Google Translate API to translate source to the target language, then use monolingual model to train and evaluation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Test“, bei dem wir Google Translate API verwenden, um Quellen in die Zielsprache zu übersetzen, und dann ein monolinguales Modell zum Training und zur Bewertung.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_144.wav", "doc_id": "wLqFAuDnKa.seg_144", "src_text": "The summary of our experimental results is that the example quality is more important than the similarity to the source sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Teil des Weges tragen. Die Zusammenfassung unserer experimentellen Ergebnisse ist, dass die Qualität der Beispiele wichtiger ist als die Ähnlichkeit zur Quatsel-Satz. Es", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_380.wav", "doc_id": "gGbuDbHhyc.seg_380", "src_text": "You can find it via the QR code on this slide.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Sie können ihn über die QR-Code auf dieser Seite", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_422.wav", "doc_id": "WBLMIsdIrq.seg_422", "src_text": "And if we use word f-measure, then models with and without context have comparable performance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Und wenn wir das Wort F-Measure verwenden, dann haben Modelle mit und ohne Kontext vergleichbare Leistung.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_484.wav", "doc_id": "SUkmfOTvGi.seg_484", "src_text": "To our next question, what causes the performance drop of some models, We had two hypothesis.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zur nächsten Frage: Was verursacht den Leistungsabfall einiger Modelle? Wir haben zwei Hypothesen,", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_300.wav", "doc_id": "PIZEXUFLAR.seg_300", "src_text": "So this shows the effect of different fine-tuning strategies on the model sensitivity.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Dies zeigt die Auswirkungen verschiedener Strategien zur Anpassung der Empfindlichkeit des Modells,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_799.wav", "doc_id": "WTTtiRKFZI.seg_799", "src_text": "So the reasoning here is that this is possible because even though this sentence violates the general grammatical principle that direct objects should be next to the verb, it satisfies the principle of dependency length minimization, which says that shorter dependencies are preferred.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "gelesen hat. Der Grund hierfür ist, dass dies möglich ist, weil dieser Satz den allgemeinen grammatischen Prinzip, dass direkte Objekte dem Subjekt nachgestellt werden, verletzt. Es erfüllt das Prinzip der Abhängigkeitslängenminimierung, das besagt, dass kürzere Abhängigkeiten bevorzugt werden.", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_866.wav", "doc_id": "GvEBWkLmuI.seg_866", "src_text": "So for example, the words describing Latina women include things like \"vibrant\" and \"curvaceous\" which connect to a trope of tropicalism.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Frau von dunkler Hautfarbe, so dass zum Beispiel die Wörter, die eine lateinische Frau beschreiben, Dinge wie Vibrant und Körperschönheit enthalten. Um die Verbindung zu einem Trope des Tropikalismus zu verbinden,", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_79.wav", "doc_id": "TVCREhgqUP.seg_79", "src_text": "We continue this process until every token from the first stage has been visited exactly once.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir setzen diesen Prozess fort? Bis zu dem Zeitpunkt, an dem jedes Token aus der ersten Phase genau einmal besucht wurde.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_399.wav", "doc_id": "WBLMIsdIrq.seg_399", "src_text": "And this is done by measuring how much information the context C provides about the target Y, given the source X. You can think of CXMI as the information gained from giving context to the model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "durch die Messung der Menge an Informationen, die die Zielsprache liefert, untermauert. Sie können CxMy als die Information betrachten, die aus dem Kontakt mit dem Modell gewonnen wurde.", "score": 81.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_106.wav", "doc_id": "uZBWfYjYnf.seg_106", "src_text": "A word is emitted if the attention is not concentrated, that is, its sum is below a certain threshold alpha towards the last lambda speech frames, meaning that the received information is enough stable.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Ein Wort wird ausgesprochen, wenn die Spannung nicht konzentriert ist, d.h. diese Summe liegt unter einem bestimmten Alpha-Schwellenwert gegenüber den letzten Lambda-Sprachrahmen, was bedeutet, dass die empfangenen Informationen stabil sind.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_200.wav", "doc_id": "SLpqvupgvW.seg_200", "src_text": "For recipes, we additionally show their images, again from Wikipedia, so that the annotators know how they look like.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "an. Für Rezepte zeigen wir außerdem ihre Bilder von Wikipedia an, damit die Annotatoren wissen, wie sie aussehen.", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_138.wav", "doc_id": "wLqFAuDnKa.seg_138", "src_text": "In our experiments, we settled for a 5-shot prompting strategy where we just marked each sentence that we provide to the system, with the language it's in.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In unseren Experimenten haben wir uns für eine fünf-Schuss-Strategie entschieden. Wo wir einfach markieren, dass wir die Sätze, die wir dem System mit der Sprache bereitstellen, mit „German“ markieren.", "score": 57.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_636.wav", "doc_id": "FLkGnzVRew.seg_636", "src_text": "This belief and action are inconsistent, and they are in dissonance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "ist dies ein Widerspruch, der sich nicht rechtfertigen lässt, und es ist", "score": 38.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_515.wav", "doc_id": "dvGkKzmIaN.seg_515", "src_text": "Finally, the watermark needs to be transferable to the attacker's services during the model extraction process.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Schließlich muss das Wasserzeichen während des Modellentfernungsprozesses auf die Oberfläche des Angreifers übertragbar sein.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_732.wav", "doc_id": "XejEJmgUmE.seg_732", "src_text": "So in this work, we revisit the minimal pair paradigms.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "In dieser Arbeit überprüfen wir also das minimale Paar-Paradigma.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_232.wav", "doc_id": "oYCKgTzTDy.seg_232", "src_text": "And we'll also test Monolingual Model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und wir werden auch das Modell der Monolinguistik testen.", "score": 72.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_816.wav", "doc_id": "WTTtiRKFZI.seg_816", "src_text": "It's absent in the second example \"Homer came and sneezed.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "In einem zweiten Beispiel, Homer Came and Sneid, haben wir", "score": 56.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_806.wav", "doc_id": "WTTtiRKFZI.seg_806", "src_text": "It violates one principle, but it satisfies another one.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Es verletzt ein Prinzip, aber es erfüllt ein anderes.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_133.wav", "doc_id": "wLqFAuDnKa.seg_133", "src_text": "The prompting has a big influence on the performance of the LLMs for translation, as we can see in a simple experiment, where we used one-shot prompting and provided two different prompts for each sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "hat einen großen Einfluss auf die Leistung der Übersetzung, wie wir in einem einfachen Experiment sehen können, in dem wir nur eine Vorhaltung verwenden und zwei verschiedene Vorhaltungen für einen Satz liefern.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_225.wav", "doc_id": "oYCKgTzTDy.seg_225", "src_text": "So to this end we propose XSemPLR.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Zu diesem Zweck schlagen wir ein Beispiel", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_5.wav", "doc_id": "aQpIWggfCo.seg_5", "src_text": "However, previous work mainly focuses on planning for the abstract goals of stereotypical activities.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Allerdings konzentrierte sich die vorherige Arbeit hauptsächlich auf die Planung für die abstrakten Ziele stereotyper Aktivitäten.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_127.wav", "doc_id": "wLqFAuDnKa.seg_127", "src_text": "In this work, we present the first systematic study of large language model prompting for machine translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "In dieser Arbeit stellen wir die erste systematische Studie zum Large-Language-Modell-Prompting für die maschinelle Übersetzung vor.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_860.wav", "doc_id": "GvEBWkLmuI.seg_860", "src_text": "So instead to do that, we'll turn to the results from our Marked Words method to show how these positive-seeming words facilitate stereotypes and essentializing narratives.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "also werden wir stattdessen zu den Ergebnissen aus unserem Markwort-Verfahren übergehen, um zu zeigen, wie diese positiven Wörter Stereotypen und Essentialisierungen erleichtern.", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_859.wav", "doc_id": "GvEBWkLmuI.seg_859", "src_text": "And in fact, this lexicon doesn't really capture many of the harmful patterns that we saw in the earlier slides well at all.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und tatsächlich hat das Lexikon nicht wirklich viele der schädlichen Muster, die wir in den früheren Ausgaben gesehen haben.", "score": 72.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_121.wav", "doc_id": "uZBWfYjYnf.seg_121", "src_text": "Thanks for your attention.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vielen Dank für Ihre Aufmerksamkeit.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_727.wav", "doc_id": "oaOHnMCwad.seg_727", "src_text": "Thank you.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vielen Dank.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_547.wav", "doc_id": "rISrKoXQCx.seg_547", "src_text": "Today I'm presenting our work \"From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "und präsentiere heute unsere Arbeit, indem wir Daten von Sprachmodellen in Downstream-Tasks vorbereiten, die Spuren politischer Überzeugungen aufsprühen lassen, die zu unfairen und unfairen Modellen führen.", "score": 38.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_595.wav", "doc_id": "oeooqChmKK.seg_595", "src_text": "Pretrained parameters can contain information about what presidents do and what a TV is but they cannot reliably know who this instance-specific entity \"John\" is, or who the new president is, because the president might have changed since pretraining.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Vortrainierte Parameter können Informationen über das, was Präsidenten tun, und über die Aktivitäten enthalten, aber sie können nicht zuverlässig wissen, wer diese spezifische Einheit John ist oder wer der neue Präsident ist, weil der Präsident sich möglicherweise seit der Vorvorbereitung geändert hat. Daher", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_300.wav", "doc_id": "PIZEXUFLAR.seg_300", "src_text": "So this shows the effect of different fine-tuning strategies on the model sensitivity.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Dies zeigt die Wirkung einer unterschiedlichen Strategie der Modellverarbeitung auf die Modellsensitivität.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_620.wav", "doc_id": "oeooqChmKK.seg_620", "src_text": "In the Background-Inference setting, we provide the fictional occupation \"mirituer\" instead of politician because \"mirituer\" is unlikely to be contained in the pretrained parameters.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Im Hintergrund, in dem wir die fiktive Beschäftigung Meritua anstelle des Politikers vorsehen, weil Meritua unwahrscheinlich in der vorgebildeten Parameter enthalten ist.", "score": 59.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_27.wav", "doc_id": "aQpIWggfCo.seg_27", "src_text": "We only keep the script if the target goal scores the highest in the goal set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir behalten das Skript nur dann bei, wenn der Zielgeist die höchste Punktzahl auf dem Geistessicht hat.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_187.wav", "doc_id": "SLpqvupgvW.seg_187", "src_text": "Where A and B are samples from Wikipedia.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "wobei A und B Proben von Wikipedia sind.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_716.wav", "doc_id": "oaOHnMCwad.seg_716", "src_text": "We find this in the GPT 4 social acceptability task as well as the Dynahate task analysis as well.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir finden dies in der GPD-4-Social Acceptability Task, ebenso wie in den Diäten", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_745.wav", "doc_id": "XejEJmgUmE.seg_745", "src_text": "We extract grammatical sentences from Adjunct Island and then we add it as a prefix to both the acceptable query and the unacceptable query.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Struktur haben, grammatikalische Sätze aus Adgentilierer auszubauen. Dann fügen wir es als Präfix zu beiden der akzeptablen und der unakzeptablen Anfrage hinzu.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_380.wav", "doc_id": "gGbuDbHhyc.seg_380", "src_text": "You can find it via the QR code on this slide.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Sie über den QR-Code auf dieser Seite", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_237.wav", "doc_id": "oYCKgTzTDy.seg_237", "src_text": "And during inference we can use this model to translate German queries or Chinese queries, et cetera.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "und während der Entwicklungsphase können wir dieses Modell auch verwenden. Um deutsche Anfragen oder chinesische Anfragen oder usw. zu übersetzen.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_484.wav", "doc_id": "SUkmfOTvGi.seg_484", "src_text": "To our next question, what causes the performance drop of some models, We had two hypothesis.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zur nächsten Frage: Was verursacht die Leistungseinbußen einiger Modelle? Wir haben zwei Hypothesen,", "score": 86.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_244.wav", "doc_id": "oYCKgTzTDy.seg_244", "src_text": "We found that Encoder-Decoder obtains the best performance on all nine datasets.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir haben festgestellt, dass der Encoder-Decoder bei allen neun Datensätzen die beste Leistung erzielt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_414.wav", "doc_id": "WBLMIsdIrq.seg_414", "src_text": "So now we use our findings from our analysis to design a benchmark for document-level translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "unseren Analysen, um einen Benchmark für die Dokumententranskription zu erstellen.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_63.wav", "doc_id": "TVCREhgqUP.seg_63", "src_text": "This can be complicated and sometimes a computationally expensive process.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Dies kann kompliziert und manchmal ein rechnerisch teures", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_443.wav", "doc_id": "hgIDlKNiFM.seg_443", "src_text": "To answer this question, we compare DrBERT with our ChuBERT model, which is based on anonymized data obtained from the Nantes University Hospital data warehouse.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Um diese Frage zu beantworten, vergleichen wir Dr. Bert mit unserem Shubert-Modell, das auf anonymisierten Daten basiert, die vom Universitätskrankenhaus stammen. Danach", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_127.wav", "doc_id": "wLqFAuDnKa.seg_127", "src_text": "In this work, we present the first systematic study of large language model prompting for machine translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In dieser Arbeit präsentieren wir die erste systematische Studie zur großen Sprachmodell-Unterstützung für maschinelle Übersetzung.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_869.wav", "doc_id": "GvEBWkLmuI.seg_869", "src_text": "This connects to an archetype that people have called the \"Strong Black Women\" archetype.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Dies verbindet sich mit einem Archetyp, den Leute den starken schwarzen Frauen-Archetyp nennen,", "score": 87.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_353.wav", "doc_id": "gGbuDbHhyc.seg_353", "src_text": "But like an elephant in the room this necessity is often overlooked.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wie ein Elefant im Raum. Diese Notwendigkeit wird oft übersehen.", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_254.wav", "doc_id": "oYCKgTzTDy.seg_254", "src_text": "We also find some other interesting findings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir finden auch einige andere interessante Ergebnisse:", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_88.wav", "doc_id": "TVCREhgqUP.seg_88", "src_text": "Our permutation method is very flexible, but it brings the challenge that finding the highest-scoring permutation is NP-hard.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Unsere Permutationsmethode ist sehr flexibel, aber sie bringt die Herausforderung mit sich, dass das Finden der höchstpunktuellen Permutation schwierig ist, denn dies", "score": 78.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_551.wav", "doc_id": "rISrKoXQCx.seg_551", "src_text": "This has created a mixed blessing for language model applications.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "einer Blüte für eine Sprachmodellanwendung geschaffen. So konnten", "score": 28.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_46.wav", "doc_id": "aQpIWggfCo.seg_46", "src_text": "Please find more details of CoScript in our paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Bitte finden Sie weitere Details des Korsscripts in unserem Papier.", "score": 84.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_453.wav", "doc_id": "hgIDlKNiFM.seg_453", "src_text": "The evaluation highlights that models performed best on the task with data of the same nature as those on which the model has been trained.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Die Bewertung der Modelle zeigt, dass das Modell am besten auf der Aufgabe mit Daten der gleichen Natur wie die, auf denen das Modell trainiert wurde,", "score": 77.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_183.wav", "doc_id": "SLpqvupgvW.seg_183", "src_text": "The first speech bubble is chosen from a few manual prompts per domain.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "die erste Sprachblase wird aus ein paar manuellen Prompts pro Domäne ausgewählt.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_41.wav", "doc_id": "aQpIWggfCo.seg_41", "src_text": "In summary, we establish the constrained language planning problem.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zusammenfassend stellen wir das eingeschränkte Sprachplanungsproblem auf, bewerten die", "score": 69.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_656.wav", "doc_id": "FLkGnzVRew.seg_656", "src_text": "Further, on iteratively fine-tuning on both tasks, we find that fine-tuning of CE tasks followed by further fine-tuning on debate yields a much better zero-shot performance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "dem besten Wert von 0,652. Zusätzlich finden wir, dass die iterativ feinjustierte CE-Tasks gefolgt von weiterer feinjustierung auf Debatten einen viel besseren Null-Shot-Performance ergibt, also ist", "score": 57.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_390.wav", "doc_id": "WBLMIsdIrq.seg_390", "src_text": "So, depending on context, the meaning of the word changes, and therefore its translation changes as well.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Dann bezieht sich mehr auf eine Geburtsmarkierung, also hängt die Bedeutung des Wortes von dem Kontext ab und ändert sich daher auch seine Übersetzung.", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_34.wav", "doc_id": "aQpIWggfCo.seg_34", "src_text": "We appy our method for building a dataset of constrained language planning, named as CoScript.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir planen unsere Methode für die Erstellung einer Datensatzes von konstruierten Sprachplanungen, benannt als „Koskript“.", "score": 31.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_794.wav", "doc_id": "WTTtiRKFZI.seg_794", "src_text": "This is illustrated here.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "werden kann. Das ist", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_553.wav", "doc_id": "rISrKoXQCx.seg_553", "src_text": "On the other hand, these different political opinions are inherently socially biased and might lead to potential fairness issues in downstream task applications.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "auf der anderen Seite sind diese unterschiedlichen politischen Meinungen sozial gefährlich und ich muss mich bemühen, potenzielle Gerechtigkeitsfragen in Downstream-Task-Applications zu vermeiden.", "score": 49.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_342.wav", "doc_id": "gGbuDbHhyc.seg_342", "src_text": "In this video, I would like to present our recent work \"Weaker Than You Think: A Critical Look at Weakly Supervised Learning.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In diesem Video möchte ich unsere jüngste Arbeit, 'Weaker than you think: A critical look at weekly surprise ratings', vorstellen.", "score": 81.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_132.wav", "doc_id": "wLqFAuDnKa.seg_132", "src_text": "Finally, we provide some recommendations for prompt selection strategies.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Schließlich geben wir einige Empfehlungen für Strategien zur schnellen Auswahl. Die", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_869.wav", "doc_id": "GvEBWkLmuI.seg_869", "src_text": "This connects to an archetype that people have called the \"Strong Black Women\" archetype.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dies führt zu einem Archetyp, den die Leute die starke schwarze Frau genannt haben,", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_616.wav", "doc_id": "oeooqChmKK.seg_616", "src_text": "For example, because new occupations have developed since the time of pretraining.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zum Beispiel, weil neue Berufe seit der Zeit der vorbereitenden Ausbildung entwickelt wurden.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_184.wav", "doc_id": "SLpqvupgvW.seg_184", "src_text": "The second one, which is the alternative question is generated as follows.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die zweite, die alternative Frage, wird wie folgt erzeugt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_317.wav", "doc_id": "dJGfOSFgZO.seg_317", "src_text": "However, we believe there is a more precise and reliable strategy for dimensional dialogue evaluation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir glauben jedoch, dass es eine präzisere und zuverlässigere Strategie für die dimensionale Dialogbewertung gibt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_230.wav", "doc_id": "oYCKgTzTDy.seg_230", "src_text": "We use Google Translate API to translate source to the target language, then use monolingual model to train and evaluation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir verwenden die Google Translate API, um die Quelltexte in die Ziel-Sprache zu übersetzen, und verwenden dann ein einlinguales Modell, um die Übersetzungen zu bewerten. Trainieren und Evaluieren.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_732.wav", "doc_id": "XejEJmgUmE.seg_732", "src_text": "So in this work, we revisit the minimal pair paradigms.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In dieser Arbeit überprüfen wir also das Minimalpaar-Paradigma.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_818.wav", "doc_id": "WTTtiRKFZI.seg_818", "src_text": "In such cases, the left conjunct prefers to be shorter; the most of the biggest difference between the two conjuncts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Recht hat, also in solchen Fällen bevorzugt der linke Kongruent zu kürzer sein, damit der größte Unterschied zwischen den beiden Ländern verringert", "score": 68.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_2.wav", "doc_id": "aQpIWggfCo.seg_2", "src_text": "In everyday life, humans often plan their actions by following step-by-step instructions in the form of goal-oriented scripts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "der alltäglichen Lebensweise planen Menschen oft ihre Aktionen, indem sie Schritt für Schritt Anweisungen in Form von garantierten Skripten befolgen.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_852.wav", "doc_id": "GvEBWkLmuI.seg_852", "src_text": "So in our method, we first designate what the unmarked and marked groups are, and then we compare the personas using the Fightin’ Words method, which is basically using weighted log-odds ratios to distinguish the top words for each marked group.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In unserer Methode bestimmen wir zunächst, was die unmarkierten und markierten Gruppen sind, und dann vergleichen wir die Personen mit dem Fighting Words-Verfahren, das im Wesentlichen die gewichteten Logits-Ratios verwendet, um die wichtigsten Wörter für jede markierte Gruppe zu unterscheiden.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_851.wav", "doc_id": "GvEBWkLmuI.seg_851", "src_text": "And more broadly, dominant groups in society are both linguistically and socially unmarked, while the marginalized groups are usually marked.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "noch, die dominierenden Gruppen in der Gesellschaft sind sowohl sprachlich als auch sozial unmarkiert, während die marginalisierten Gruppen normalerweise markiert sind.", "score": 82.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_344.wav", "doc_id": "gGbuDbHhyc.seg_344", "src_text": "I'd like to begin with a brief introduction to weak supervision and weakly supervised learning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Ich möchte gerne mit einer kurzen Einführung in die wöchentliche Überwachung und die wöchentliche Überwachung beginnen.", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_186.wav", "doc_id": "SLpqvupgvW.seg_186", "src_text": "Do you mean A or B?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Möchten Sie A oder B,", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_145.wav", "doc_id": "wLqFAuDnKa.seg_145", "src_text": "So it's important to select the examples from high-quality translations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "ist wichtig, die Beispiele aus den Übersetzungen von", "score": 52.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_829.wav", "doc_id": "GvEBWkLmuI.seg_829", "src_text": "This work is done in collaboration with Esin Durmus and Dan Jurafsky.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wird in Zusammenarbeit mit Esnur Dündar und Dündar gemacht.", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_374.wav", "doc_id": "gGbuDbHhyc.seg_374", "src_text": "Our concrete recommendations for future work are as follows.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Unsere konkreten Empfehlungen für die zukünftige Arbeit sind wie folgt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_499.wav", "doc_id": "SUkmfOTvGi.seg_499", "src_text": "Thank you so much.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vielen Dank.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_703.wav", "doc_id": "oaOHnMCwad.seg_703", "src_text": "We've then compared these, annotations with Social Chemistry, Delphi and GPT 4.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir haben diese Annotationen dann mit Social Chemistry, Delphi und GPT4 verglichen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_761.wav", "doc_id": "XejEJmgUmE.seg_761", "src_text": "Now this and this is very large like this effect, increases throughout the context length and this would probably affect like newer language models which has large context window.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Jetzt ist dies sehr groß, wie dieser Effekt sich erhöht. durch die gesamte Kontextlänge und dies würde wahrscheinlich die neueren Sprachmodelle, die einen großen Kontextfenster haben,", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_420.wav", "doc_id": "WBLMIsdIrq.seg_420", "src_text": "First of all, when we use corpus-level metrics: so for BLEU, we find that context-agnostic models have the best performance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zunächst einmal, wenn wir Korrelationsmetriken verwenden, finden wir, dass die kollinearen Modelle die beste Leistung erbringen.", "score": 33.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_158.wav", "doc_id": "wLqFAuDnKa.seg_158", "src_text": "Thank you very much.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Vielen Dank.", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_393.wav", "doc_id": "WBLMIsdIrq.seg_393", "src_text": "And some people have suggested targeted evaluation on context-dependent translations, but these resources only support limited types of context-dependent translations and limited sets of languages since they usually rely on domain knowledge and human curation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "einige Personen haben vorgeschlagen, eine gezielte Bewertung von kontextabhängigen Übersetzungen, aber diese Ressourcen unterstützen nur begrenzte Arten von kontextabhängigen Übersetzungen und begrenzte Sprachgruppen, da sie normalerweise auf den Hauptwissenswert und die menschliche Schöpfung vertrauen.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_33.wav", "doc_id": "aQpIWggfCo.seg_33", "src_text": "Thus, we follow the idea of symbolic knowledge distillation, to distil constrained language planning datasets from large language models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Daher folgen wir der Idee der symbolischen Wissensdistillation, um eingeschränkte Sprachplanungsdatensätze aus Sprachmodellen zu distillieren.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_686.wav", "doc_id": "oaOHnMCwad.seg_686", "src_text": "And as a researcher, positionality can influence the research process and its outcomes and results because it can change the decisions that researchers make.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und als Forscher kann Positionalität den Forschungsprozess und dessen Ergebnisse und Ergebnisse beeinflussen, da sie die Entscheidungen beeinflussen, die Forscher treffen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_738.wav", "doc_id": "XejEJmgUmE.seg_738", "src_text": "These days large language models are coming up with longer and longer context windows.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Diese Tage kommen größere Sprachmodelle mit längeren und", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_240.wav", "doc_id": "oYCKgTzTDy.seg_240", "src_text": "So during training, we train it on English queries or the combination of English and German Few-shot queries to train a multilingual model to predict the SQL output.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Während des Trainings werde ich mich auf die englische Sprache oder eine Kombination aus englischen und deutschen Suchbegriffen konzentrieren, um ein mehrsprachiges Modell zu trainieren und die Ausgabe vorherzusagen.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_487.wav", "doc_id": "SUkmfOTvGi.seg_487", "src_text": "For data overfitting, we saw that from the graph on the right, the red best fit line has a gradient that is greater than one.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Für die adaptive Überpassung sahen wir, dass die rote bestgepasste Linie auf dem Graphen rechts eine Steigung hat, die größer ist als 1. Dies", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_103.wav", "doc_id": "uZBWfYjYnf.seg_103", "src_text": "And leverage the knowledge already acquired by the model through the attention mechanism between audio input and textual output.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "die bereits erworbenen Kenntnisse werden durch das Modell durch den Wahrnehmungsmechanismus zwischen Audioeingabe und Textausgabe übertragen, das", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_470.wav", "doc_id": "SUkmfOTvGi.seg_470", "src_text": "At the same time, if we do observe poor generalization, what causes the performance drop of these models?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Gleichzeitig, wenn wir eine schlechte Generalisierung beobachten, was verursacht den Leistungsabfall dieser Modelle?", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_872.wav", "doc_id": "GvEBWkLmuI.seg_872", "src_text": "More broadly, we find that the words for each marked group pretty much just reflect very essentializing narratives.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "stellen fest, dass die Wörter in der Markgruppe sehr wesentlich sind. Basierend auf diesen Mustern", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_816.wav", "doc_id": "WTTtiRKFZI.seg_816", "src_text": "It's absent in the second example \"Homer came and sneezed.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "zweiten Beispiel, in dem", "score": 31.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_851.wav", "doc_id": "GvEBWkLmuI.seg_851", "src_text": "And more broadly, dominant groups in society are both linguistically and socially unmarked, while the marginalized groups are usually marked.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Gruppen in der Gesellschaft sind entweder sprachlich oder sozial unmarkiert, während die Randgruppen üblicherweise markiert sind.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_61.wav", "doc_id": "TVCREhgqUP.seg_61", "src_text": "The trees are intended to capture the compositional process that relates utterances with the logical forms.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die Bäume sollen den kompositorischen Prozess erfassen, der Äußerungen mit logischen Formen verbindet.", "score": 68.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_751.wav", "doc_id": "XejEJmgUmE.seg_751", "src_text": "Finally, we can choose sentences from a completely unrelated domain such as Wikipedia.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Schließlich können wir Sätze aus einer völlig unabhängigen Domäne wie Wikipedia auswählen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_500.wav", "doc_id": "dvGkKzmIaN.seg_500", "src_text": "Hello everyone, my name is Jingwei Yi from the University of Science and Technology of China.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hallo alle, mein Name ist Qingwei von der Universität für Wissenschaft und Technologie in China.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_77.wav", "doc_id": "TVCREhgqUP.seg_77", "src_text": "Then we jump to the next multiset token, to determine the second token in the output.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dann springen wir zum nächsten Multiset-Token, um den zweiten Token im Output zu bestimmen.", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_857.wav", "doc_id": "GvEBWkLmuI.seg_857", "src_text": "So, while the generated personas have much higher rates of the lexicon words, the human-written ones have a much wider distribution of words, while the stereotype words that are in the generated personas are really just the words \"tall\" and \"athletic\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "die erzeugten Personen eine viel höhere Rate der Luxon-Wörter, die menschlichen Personen eine viel breitere Verteilung der Wörter und die Stereotypen der erzeugten Personen sind wirklich nur die Wörter hoch und niedrig.", "score": 33.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_762.wav", "doc_id": "XejEJmgUmE.seg_762", "src_text": "So why does the match prefix affect the language model judgement so much?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Warum beeinflusst das Match-Prefix so stark die Sprachmodellbewertung? Daher führen", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_382.wav", "doc_id": "gGbuDbHhyc.seg_382", "src_text": "Thank you and enjoy the conference.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "auszuprobieren. Danke und ich habe mich der Konferenz angeschlossen.", "score": 49.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_349.wav", "doc_id": "gGbuDbHhyc.seg_349", "src_text": "In weakly supervised learning, training algorithms are proposed to robustly train neural networks under such label noise so that the trained models still generalize well.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "generalisieren. Bei schwachen Trainings werden Trainingsalgorithmen vorgeschlagen, um neuronale Netzwerke robust unter solchen Labeln zu trainieren, so dass sich die Trainingsmodelle weiterhin vereinfachen.", "score": 59.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_112.wav", "doc_id": "uZBWfYjYnf.seg_112", "src_text": "So we want our curves to be as high as possible on this plot.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "wollen wir, dass unsere Kurse so hoch wie möglich auf", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_491.wav", "doc_id": "SUkmfOTvGi.seg_491", "src_text": "For temporal drift, we did an experiment to retrain or continue to pre-train some models with more recent data and we found that the performance degrades with larger temporal gap and this confirms our hypothesis that the main cause of the performance drop is temporal drift.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "den zeitlichen Drift haben wir ein Experiment durchgeführt, um einige Modelle mit neueren Daten neu zu trainieren oder mit neueren Daten fortzufahren, und wir stellten fest, dass die Leistung mit größerem zeitlichen Abstand abnimmt. Dies bestätigt unsere Hypothese, dass die Hauptursache für den Leistungsschwankungs ist, dass es einen zeitlichen Drift gibt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_325.wav", "doc_id": "dJGfOSFgZO.seg_325", "src_text": "For each of the existing methods, we collected evaluations on eight of the most commonly measured aspects of dialogue, since this is the standard practice for evaluating chat models along multiple dimensions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Für jede der bestehenden Methoden haben wir Bewertungen zu acht der am häufigsten gemessenen Aspekte des Dialogs gesammelt, da dies die Standardpraxis zur Bewertung von Chatmodellen in mehreren Dimensionen ist.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_572.wav", "doc_id": "rISrKoXQCx.seg_572", "src_text": "For example, for hate speech detection, left-leaning language models are better at detecting hate speech targeting socially minority groups, however are worse at detecting hate speech targeting more powerful groups in our society.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "zum Beispiel, dass für Sprachdetektion linke Sprachmodelle besser sind. Bei der Entdeckung von Hassreden, die sich auf soziale Minderheitengruppen richten Wir tun jedoch besser daran, Hassreden zu erkennen und stärkere Gruppen in unserer Gesellschaft zu", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_96.wav", "doc_id": "uZBWfYjYnf.seg_96", "src_text": "Specific architectures are usually trained, introducing additional modules to be optimized.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Spezifische Architekturen werden normalerweise trainiert, um zusätzliche Module einzuführen, die optimiert werden sollen. Langwierige", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_759.wav", "doc_id": "XejEJmgUmE.seg_759", "src_text": "And there we see that the MPP judgments either increase or decrease significantly when you add either acceptable prefixes or unacceptable prefixes.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und hier sehen wir, dass die MPP-Urteile entweder signifikant zunehmen oder abnehmen, wenn Sie entweder akzeptable oder nicht akzeptable Präfixe hinzufügen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_813.wav", "doc_id": "WTTtiRKFZI.seg_813", "src_text": "But what's novel in this paper is that we observed that this tendency only occurs when the governor is on the left or absent.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "diesem Papier neuartig ist, ist, dass wir beobachtet haben, dass diese Tendenz nur auftritt, wenn der Gouverneur abwesend ist. Recht so, also ist der", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_424.wav", "doc_id": "WBLMIsdIrq.seg_424", "src_text": "Now, we use the MuDA benchmark to evaluate models and we find that context-aware models are significantly more accurate than models that do not use context for certain discourse phenomena such as formality and lexical cohesion.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "verwenden wir das Muad-Benchmark, um Modelle zu bewerten, und stellen fest, dass Kontextmodellierungen für bestimmte Diskursphänomene deutlich genauer sind als Modelle, die keinen Kontext verwenden. Aber", "score": 86.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_477.wav", "doc_id": "SUkmfOTvGi.seg_477", "src_text": "Throughout experiments we found that there are three main ingredients that are needed.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "unseren Experimenten stellten wir fest, dass es drei Hauptbestandteile gibt, die erforderlich sind.", "score": 83.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_120.wav", "doc_id": "uZBWfYjYnf.seg_120", "src_text": "And we also released open source the code and models and simultaneous output to facilitate the reproducibility of our work.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "und sehen Sie sich auch unsere offene Quelle, den Code und die Modelle und die simultane Ausgabe an, um die Reproduzierbarkeit unserer Arbeit zu erleichtern.", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_111.wav", "doc_id": "uZBWfYjYnf.seg_111", "src_text": "If we look at the main results of EDAtt, we'll plot the simultaneous speech translation results on graphs in which we have BLEU on one side that measures the translation quality, and average lagging that is the latency measure, and we also consider the computational aware average lagging that accounts for the model's computational times to predict the output.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wenn Sie sich die wichtigsten Ergebnisse dazu ansehen. Wir plotten die Ergebnisse der simultanen Übersetzung auf Diagrammen, auf denen wir auf der einen Seite blau haben, das die Übersetzungskvalität misst, und auf der anderen Seite das durchschnittliche Legen. Dies ist das Latenzmaß, und wir betrachten auch das berechnete durchschnittliche Latenzmaß, das für die Berechnung der Ausgaben der Modelle verantwortlich ist. Also", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_144.wav", "doc_id": "wLqFAuDnKa.seg_144", "src_text": "The summary of our experimental results is that the example quality is more important than the similarity to the source sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "des Gewichts tragen. Die Zusammenfassung unserer experimentellen Ergebnisse ist, dass die Beispielqualität wichtiger ist als die Ähnlichkeit zum Quelltext. Es", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_210.wav", "doc_id": "SLpqvupgvW.seg_210", "src_text": "For example, when the language model retrieves the background knowledge.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "realistischer ist, wenn das Sprachmodell die Hintergrundkenntnisse wiederherstellt.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_401.wav", "doc_id": "WBLMIsdIrq.seg_401", "src_text": "We can think of words that have high P-CXMI as ones that require context for translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "und können Wörter mit hoher p csmi als Wörter denken, die Kontext für die Übersetzung benötigen.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_203.wav", "doc_id": "SLpqvupgvW.seg_203", "src_text": "Here are some examples from our dataset.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Klaviermusik, hier sind einige Beispiele aus unserem Datensatz,", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_380.wav", "doc_id": "gGbuDbHhyc.seg_380", "src_text": "You can find it via the QR code on this slide.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Sie können ihn über den QR-Code auf dieser Folie", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_84.wav", "doc_id": "TVCREhgqUP.seg_84", "src_text": "First of all, the alignment between input and output is not given in the training data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Erstens wird die Ausrichtung zwischen Input und Output nicht in den Trainingsdaten angegeben; Folglich", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_297.wav", "doc_id": "PIZEXUFLAR.seg_297", "src_text": "So we also did one experiment.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Also haben wir auch ein Experiment gemacht,", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_723.wav", "doc_id": "oaOHnMCwad.seg_723", "src_text": "I mean, we want to emphasise that inclusive NLP isn't just making.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "ist die Masakani-Initiative. Wir möchten betonen, dass dies nicht bedeutet, dass alle", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_70.wav", "doc_id": "TVCREhgqUP.seg_70", "src_text": "After the first step, we have all the right tokens, but they're not ordered.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Nach dem ersten Schritt haben wir alle richtigen Tokens, aber keine Bestellungen.", "score": 76.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_713.wav", "doc_id": "oaOHnMCwad.seg_713", "src_text": "So for GPT 4, in the social acceptability task, we find that it's most aligned to people with a college education or Graduate School education and we find the same for Dynahate where it's most aligned to people with a college education.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "also für GBDT4 in der sozialakzeptablen Aufgabe finden wir heraus, dass es sich am meisten mit Personen mit einem College-Abschluss oder einem Hochschulabschluss übereinstimmt, und wir finden das Gleiche für DANNI, wo es sich am meisten", "score": 83.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_349.wav", "doc_id": "gGbuDbHhyc.seg_349", "src_text": "In weakly supervised learning, training algorithms are proposed to robustly train neural networks under such label noise so that the trained models still generalize well.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Bei schwach überwachtem Training werden Trainingsalgorithmen vorgeschlagen, um neuronale Netze unter der Bezeichnung „Noise“ robust zu trainieren, sodass die Trainingsmodelle noch immer stark verallgemeinert werden.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_637.wav", "doc_id": "FLkGnzVRew.seg_637", "src_text": "Further mentioning that \"I don't think I could keep my job without them\" justifies the second occurrence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "dass ich meinen Job nicht ohne sie bekommen würde, was die zweite Akte rechtfertigt und", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_430.wav", "doc_id": "WBLMIsdIrq.seg_430", "src_text": "See you in Toronto.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "bis bald in Toronto!", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_485.wav", "doc_id": "SUkmfOTvGi.seg_485", "src_text": "The first one is adaptive overfitting, which is overfitting caused by reusing the same test set over and over again and this is usually manifested as the diminishing returns on a new test set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "die erste ist das adaptative Überpassen, was durch die Wiederholung des gleichen Testsets verursacht wird, und dies tritt normalerweise in Erscheinung, wenn die Abnahme des neuen Testsets zurückkehrt.", "score": 45.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_524.wav", "doc_id": "dvGkKzmIaN.seg_524", "src_text": "We assume the provider can collect a general text corpus and count the word frequency with it.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir nehmen an, dass der Anbieter einen allgemeinen Textkörper sammeln und die Wortfrequenz zählen kann. Bei der", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_99.wav", "doc_id": "uZBWfYjYnf.seg_99", "src_text": "For example, training a model with an average of one second latency and another one with two seconds latency, and so on.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "B. ein Modell mit einer Latenzzeit von einer Sekunde und ein anderes mit einer Latenzzeit von zwei Sekunden und so weiter.", "score": 73.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_428.wav", "doc_id": "WBLMIsdIrq.seg_428", "src_text": "To summarize, we perform a data-driven analysis across 14 language pairs to identify when translations require context and then we use our findings to build a benchmark for document-level machine translation which can help us identify which discourse phenomena models can handle well or not, and which translation systems are good at document-level translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "für Dokument-Übersetzungen. Zusammenfassend führen wir eine datengetriebene Analyse in 14 Sprachpaaren durch, um zu bestimmen, wann Übersetzungen Kontext benötigen, und verwenden dann unsere Ergebnisse, um einen Benchmark für Dokumenten-Ebene-Maschinensprachübersetzung zu erstellen, der uns hilft, zu bestimmen, welche Diskursphänomene-Modellmodelle gut oder schlecht handhaben können, und welche Übersetzungs-Systeme gut in der Dokumenten-Ebene-Übersetzung sind.", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_615.wav", "doc_id": "oeooqChmKK.seg_615", "src_text": "This last setting is especially interesting, since it simulates the case where the background knowledge necessary to solve a task is not part of the pretrain data of models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Diese letzte Einstellung ist besonders interessant, da sie die Fälle simuliert, in denen das Hintergrundwissen zur Lösung einer Aufgabe erforderlich ist, was nicht Teil der Vorbereitungsdaten der Modelle ist.", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_143.wav", "doc_id": "wLqFAuDnKa.seg_143", "src_text": "It's the examples that carry most of the weight.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "sind die Beispiele, die den größten", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_267.wav", "doc_id": "PIZEXUFLAR.seg_267", "src_text": "Therefore, in this work we want to investigate whether instruction tuning a multi-modal pre-trained models can actually improve generalisation to unseen multi-modal tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Daher wollen wir in diesem Projekt untersuchen, ob die Anpassung von Trainingsmodellen von Multimodalmodellen durch Anpassung der Anweisungen tatsächlich die Generierung von unsichtbaren Multimodal-Aufgaben verbessert.", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_429.wav", "doc_id": "WBLMIsdIrq.seg_429", "src_text": "Thank you so much for your attention.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "dass Sie sich so sehr", "score": 25.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_769.wav", "doc_id": "XejEJmgUmE.seg_769", "src_text": "Please read our paper for more details of our experiments.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "nicht vollständig erfassen. Bitte lesen Sie unseren Aufsatz für weitere Einzelheiten zu unseren Experimenten.", "score": 29.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_134.wav", "doc_id": "wLqFAuDnKa.seg_134", "src_text": "The majority of sentences 516 out of 1,000.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Sätze bereitgestellt haben, sehen wir, dass die Mehrheit der Sätze 516 von 1.000", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_599.wav", "doc_id": "oeooqChmKK.seg_599", "src_text": "We evaluate the data set with human study participants and established coreference resolution models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir bewerten das Datensatz mit menschlichen Studienteilnehmern und etablieren ein Korreferenzlösungsmodell.", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_605.wav", "doc_id": "oeooqChmKK.seg_605", "src_text": "The task here is to identify the correct entity that the pronoun \"he\" refers to, which in this case is Servin.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die Aufgabe hier ist es, die korrekte Entität zu identifizieren, die der Pronomen, das er sich anbietet, entspricht, das in diesem Fall Diener ist.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_424.wav", "doc_id": "WBLMIsdIrq.seg_424", "src_text": "Now, we use the MuDA benchmark to evaluate models and we find that context-aware models are significantly more accurate than models that do not use context for certain discourse phenomena such as formality and lexical cohesion.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Jetzt verwenden wir den MUDABenchmark, um Modelle zu bewerten, und wir stellen fest, dass Kontextwörtermodelle signifikant genauer sind als Modelle, die bestimmte Diskursphänomene wie Formalität und lexikalische Kohärenz nicht verwenden,", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_29.wav", "doc_id": "aQpIWggfCo.seg_29", "src_text": "Our method greatly improves the planning ability both in semantic completeness and faithfulness to the constraint.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "die semantische Vollständigkeit als auch die Treue zu den Einschränkungen verbessern.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_159.wav", "doc_id": "SLpqvupgvW.seg_159", "src_text": "Hi!", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hi,", "score": 59.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_417.wav", "doc_id": "WBLMIsdIrq.seg_417", "src_text": "We can then also note that different languages have different proportions of these discourse phenomena.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir können dann auch feststellen, dass verschiedene Sprachen unterschiedliche Proportionen dieser diskursiven Phänomene", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_203.wav", "doc_id": "SLpqvupgvW.seg_203", "src_text": "Here are some examples from our dataset.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hier sind einige Beispiele aus unserem Datensatz,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_161.wav", "doc_id": "SLpqvupgvW.seg_161", "src_text": "My name is Javad Hosseini and this is a joint work with Filip Radlinski, Silvia Pareti, and Annie Louis.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Mein Name ist Javad Husseini und dies ist eine gemeinsame Arbeit mit Philip Radlinski, Sylvia Pareti und Ani Choi.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_14.wav", "doc_id": "aQpIWggfCo.seg_14", "src_text": "This table reports the overall accuracy of the results.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Diese Tabelle berichtet über die Gesamtgenauigkeit der Ergebnisse.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_205.wav", "doc_id": "SLpqvupgvW.seg_205", "src_text": "The AltEntities Corpus has 6,000 alternative questions across three domains, and it has 42,000 indirect referring expressions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Fragen in drei Domänen und vierundzwanzig indirekte Referenzausdrücke.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_699.wav", "doc_id": "oaOHnMCwad.seg_699", "src_text": "In Live in the Wild is an online experimentation platform where we can recruit diverse volunteers.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "von Hci. Auf der Plattform „in the wild“ können wir verschiedene Freiwillige einstellen, vergleichbar", "score": 54.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_820.wav", "doc_id": "WTTtiRKFZI.seg_820", "src_text": "So we showed that by measuring length in characters, the first column, in syllables the middle column, and in words the right column.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "die Länge in Zeichen durch Messen der Länge in Zeichen der ersten Spalte in Silben, der mittleren Spalte in Silben und der rechten Spalte in Wörtern", "score": 68.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_10.wav", "doc_id": "aQpIWggfCo.seg_10", "src_text": "In this paper, we first evaluate and improve the constrained language planning ability of large language models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In diesem Papier bewerten und verbessern wir zunächst die begrenzte Sprachplanbarkeit von Sprachmodellen. Es gibt nichts außer bestimmten Zahlen,", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_398.wav", "doc_id": "WBLMIsdIrq.seg_398", "src_text": "In the previous work, we introduced CXMI as a measure for context usage by machine translation models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "In der vorherigen Arbeit haben wir CSMI als Maß für Kontextnutzer durch Maschinentranslationsmodelle eingeführt, und dies wird getan,", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_664.wav", "doc_id": "FLkGnzVRew.seg_664", "src_text": "Note that the performance is significantly lower for random.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "beachten Sie, dass die Leistung erheblich niedriger ist für zufällige.", "score": 74.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_719.wav", "doc_id": "oaOHnMCwad.seg_719", "src_text": "First one is keep a record of all relevant design choices throughout the research process.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "paar Empfehlungen für das, erstens ist es eine Aufzeichnung aller relevanten Designentscheidungen während des Forschungsprozesses und zweitens", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_839.wav", "doc_id": "GvEBWkLmuI.seg_839", "src_text": "Immediately we see that, while the outputs aren't overtly negative or toxic in the traditional sense of these words, there are some interesting patterns.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Unmittelbar können wir sehen, dass die Ausgänge in traditionellerem Sinne negativ oder giftig sind. Es gibt einige interessante Muster.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_876.wav", "doc_id": "GvEBWkLmuI.seg_876", "src_text": "And finally, there should really be increased transparency about bias mitigation methods, because for instance, like these positive stereotypes, we don't know if it's because there is some sort of weird overly-excessive value alignment going on, or maybe some other anti-stereotyping methods that are resulting in these pernicious patterns.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und schließlich sollte die Transparenz in Bezug auf die Methoden der Datenerhebung verbessert werden. Denn zum Beispiel wissen wir bei diesen positiven Stereotypen nicht, ob es daran liegt, dass es irgendwie komisch ist. Übermäßige Wertlinien gehen weiter, oder vielleicht einige andere, wie Stereotypen, die in diesen lästigen Mustern resultieren.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_48.wav", "doc_id": "TVCREhgqUP.seg_48", "src_text": "My name is Matthias Lindemann, and today I'm going to give you a brief introduction to our paper on \"Compositional Generalization without Trees using Multiset Tagging and Latent Permutations\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "mein Name ist Matthias Lindemann, und heute werde ich Ihnen eine kurze Einführung in unser Papier über die kompositionelle Generalisierung ohne Bäume geben, wobei wir Multi-Set-Markierungen und latente Permutationen verwenden.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_535.wav", "doc_id": "dvGkKzmIaN.seg_535", "src_text": "We compute the similarity difference between benign and backdoor data set which is defined as delta cosine and delta L2.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir haben die Ähnlichkeitsdifferenz zwischen den Datensätzen \"benign\" und \"Backdoor\" berechnet. Diese wird als Delta Cosine und Delta", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_783.wav", "doc_id": "WTTtiRKFZI.seg_783", "src_text": "So we get dependencies from the governor.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "so erhalten wir Abhängigkeiten von dem", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_771.wav", "doc_id": "WTTtiRKFZI.seg_771", "src_text": "Hi, my name is Adam Przepiórkowski and this talk is about the Dependency Structure of Coordination.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo, mein Name ist Adam Schirkowski, und dieses Gespräch dreht sich um die Abhängigkeitsstruktur der Koordination.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_170.wav", "doc_id": "SLpqvupgvW.seg_170", "src_text": "Or when the user wants to specify a preference.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Oder wenn der Benutzer eine Präferenz spezifizieren möchte,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_620.wav", "doc_id": "oeooqChmKK.seg_620", "src_text": "In the Background-Inference setting, we provide the fictional occupation \"mirituer\" instead of politician because \"mirituer\" is unlikely to be contained in the pretrained parameters.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Im Hintergrund einer Fähnrichsstellung bieten wir die fiktive Berufsbezeichnung „Meritua“ anstelle von Politiker, weil Meritua unwahrscheinlich in einem vorbereitenden Paradies enthalten sein kann.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_7.wav", "doc_id": "aQpIWggfCo.seg_7", "src_text": "In this paper, we define the problem of constrained language planning which imposes different constraints on the goals of planning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In dieser Arbeit definieren wir das Problem der eingeschränkten Sprachplanung, das unterschiedliche Einschränkungen auf die Ziele der Planung auferlegt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_829.wav", "doc_id": "GvEBWkLmuI.seg_829", "src_text": "This work is done in collaboration with Esin Durmus and Dan Jurafsky.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Diese Arbeit wurde in Zusammenarbeit mit Esin Durmus und Dan Jurafsky durchgeführt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_523.wav", "doc_id": "dvGkKzmIaN.seg_523", "src_text": "The trigger set is a group of words in a moderate frequency interval.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Auslöser ist eine Gruppe von Wörtern in einem moderaten Frequenzintervall.", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_874.wav", "doc_id": "GvEBWkLmuI.seg_874", "src_text": "First, we should, as researchers, be addressing positive stereotypes and essentializing narratives.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "geben. Zunächst sollten wir, wie Forscher, positive Stereotypen und Erzählungen", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_822.wav", "doc_id": "WTTtiRKFZI.seg_822", "src_text": "What we see here is that when the governor is on the left, the tendency for the left conjunct to be shorter grows steadily, with the absolute difference in words, and the same is observed when there is no governor as in coordination of sentences.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "linken Seite ist. Die Tendenz, dass das linke Konjunktiv kürzer ist, wächst stetig mit der absoluten Differenz der Wörter und wird bei keinem Gouverneur", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_501.wav", "doc_id": "dvGkKzmIaN.seg_501", "src_text": "It's my pleasure to give a short advertisement video of our paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Es ist mir ein Vergnügen, ein kurzes", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_59.wav", "doc_id": "TVCREhgqUP.seg_59", "src_text": "In particular, they often fail to reproduce the systematic correspondences between input and output, such as those that are color-coded in the example.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Insbesondere haben sie oft versagt, die systematischen Korrespondenzen zwischen Eingabe und Ausgabe zu reproduzieren, wie diejenigen, die in den Beispielen farbig kodiert sind.", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_738.wav", "doc_id": "XejEJmgUmE.seg_738", "src_text": "These days large language models are coming up with longer and longer context windows.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Heutzutage kommen große Sprachmodelle mit längeren und", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_743.wav", "doc_id": "XejEJmgUmE.seg_743", "src_text": "So for example, here we have chosen like a typical pair of grammaticality from the BLiMP data set from the Adjunct Island case.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zum Beispiel haben wir hier ein typisches Paar von Grammatiken aus dem Blimp-Datensatz des Adjunct Island-Falles ausgewählt.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_781.wav", "doc_id": "WTTtiRKFZI.seg_781", "src_text": "So, we get some dependencies from end to all the conjuncts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "So erhalten wir einige Abhängigkeiten von Ende bis zu allen Kontrahenten.", "score": 59.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_739.wav", "doc_id": "XejEJmgUmE.seg_739", "src_text": "So it's crucial that we evaluate the models' acceptability throughout the context window and that is what we are trying to do here.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "längeren Kontextfenstern daher, daher ist es „ganz klar“, dass wir die Modelle im gesamten Kontext bewerten. Und das ist es, was wir hier", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_878.wav", "doc_id": "GvEBWkLmuI.seg_878", "src_text": "Thank you so much for listening.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Vielen Dank für Ihre", "score": 62.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_143.wav", "doc_id": "wLqFAuDnKa.seg_143", "src_text": "It's the examples that carry most of the weight.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "sind die Beispiele, die den größten", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_708.wav", "doc_id": "oaOHnMCwad.seg_708", "src_text": "We find that there is positionality in NLP.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "um die Position in Np handelt. Zum", "score": 57.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_748.wav", "doc_id": "XejEJmgUmE.seg_748", "src_text": "So that is what we call as the mismatch scenario.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "was wir als „Mismatch-Szenario“ bezeichnen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_868.wav", "doc_id": "GvEBWkLmuI.seg_868", "src_text": "And finally, for black women, we see that some of the top words are things like \"strong\" and \"resilient\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "weiter. Und schließlich sehen wir, dass einige der Top-Wörter für schwarze Frauen Dinge wie „stark“ und „resistent“ sind.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_335.wav", "doc_id": "dJGfOSFgZO.seg_335", "src_text": "They produce irrelevant information in around 15% of the responses, and they contradict themselves or their partner around 10% of the time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Sie produzieren irrelevante Informationen in etwa fünfzig Prozent der Antworten und widersprechen sich selbst oder ihrem Partner in etwa zehn Prozent der Zeit.", "score": 15.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_400.wav", "doc_id": "WBLMIsdIrq.seg_400", "src_text": "In this work, we extend CXMI to Pointwise CXMI which can measure context usage at the sentence level or at the word level.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "In diesem Werk erweitern wir csmi zu y csmi, das Kontextverwendung auf Satz- oder Wortebene messen kann,", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_326.wav", "doc_id": "dJGfOSFgZO.seg_326", "src_text": "From our analysis of these evaluation results, we found that ABC-Eval behavior labels are overall more reliable than labels collected by existing methods, as measured by inter-annotator agreement on 100 doubly-labeled conversations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Aus unseren Analysen dieser Bewertungsergebnisse haben wir festgestellt, dass die Verhaltensetiketten von ABC im Allgemeinen zuverlässiger sind als Etiketten, die mit bestehenden Methoden gesammelt wurden, gemessen an der inneren Annotator-Einigung auf 1000 doppelt etikettierten Konversationen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_238.wav", "doc_id": "oYCKgTzTDy.seg_238", "src_text": "And we also consider Cross-lingual Zero-shot and Few-shot transfer.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "auch die Übertragung zwischen", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_269.wav", "doc_id": "PIZEXUFLAR.seg_269", "src_text": "There exist more than 1600 language-only instruction tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Es gibt mehr als 1.600 Sprachdatensätze, aber", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_141.wav", "doc_id": "wLqFAuDnKa.seg_141", "src_text": "It's crucial for zero and one-shot prompting.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Schießen und", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_3.wav", "doc_id": "aQpIWggfCo.seg_3", "src_text": "Previous work has exploited language models to plan for abstract goals of stereotypical activities such as \"make a cake\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In früheren Arbeiten wurden Sprachmodelle verwendet, um für abstrakte Ziele stereotypischer Aktivitäten wie 'Make a Cake' zu planen", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_339.wav", "doc_id": "dJGfOSFgZO.seg_339", "src_text": "And we look forward to seeing how conversational AI will advance in the coming months and years.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "und wir freuen uns darauf, zu sehen, wie sich die konversationsbasierte KI in den kommenden Monaten und Jahren weiterentwickeln wird.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_570.wav", "doc_id": "rISrKoXQCx.seg_570", "src_text": "So last but not least, we evaluate language models with different political leanings on hate speech detection and fake news detection to NLP applications that often involve language models and could have very significant implications.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Daher bewerten wir Sprachmodelle mit unterschiedlichen politischen Absichten und mit Sprach- und Falschheitsdetektionen, um Anwendungen zu erkennen, die Sprachmodelle anbieten und sehr bedeutende Auswirkungen haben können.", "score": 61.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_385.wav", "doc_id": "WBLMIsdIrq.seg_385", "src_text": "This work was done in collaboration with Patrick Fernandes, Emmy Liu, André F. T. Martins, and Graham Neubig.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "in Zusammenarbeit mit Patrick Fernand, M. E. und G. Newick. Die", "score": 13.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_525.wav", "doc_id": "dvGkKzmIaN.seg_525", "src_text": "In watermark injection, we first define a target embedding.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "der Wasserzeichen-Injektion definieren wir zunächst eine Ziel-Embedding:", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_353.wav", "doc_id": "gGbuDbHhyc.seg_353", "src_text": "But like an elephant in the room this necessity is often overlooked.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "aber wie bei einem Elefanten im Raum wird diese Notwendigkeit oft übersehen.", "score": 72.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_516.wav", "doc_id": "dvGkKzmIaN.seg_516", "src_text": "Existing works can be broadly classified into four categories.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Bestehende Werke lassen sich grob in vier Kategorien einteilen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_371.wav", "doc_id": "gGbuDbHhyc.seg_371", "src_text": "So in practice, there's no reason to choose more complex WSL methods which require more computation time and disk space.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "In der Praxis gibt es also keinen Grund, komplexere WSL-Methoden zu wählen, die mehr Rechenzeit und Speicherplatz erfordern.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_337.wav", "doc_id": "dJGfOSFgZO.seg_337", "src_text": "However, this is all the more reason to pursue reliable and precise evaluation metrics for comparing models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dies ist jedoch ein weiterer Grund, zuverlässige und genaue Bewertungsmetriken für vergleichbare Modelle zu verwenden.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_201.wav", "doc_id": "SLpqvupgvW.seg_201", "src_text": "Then, we asked the annotators to pick one of these entities, for example, here's the first one, and describe them using three to five indirect referring expressions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Dann bitten wir die Garanten, eine dieser Entitäten, zum Beispiel die erste, auszuwählen und sie mit drei bis fünf indirekten Bezugsausdrücken zu beschreiben.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_724.wav", "doc_id": "oaOHnMCwad.seg_724", "src_text": "You know, all technologies work for everyone.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Technologien für jeden funktionieren,", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_393.wav", "doc_id": "WBLMIsdIrq.seg_393", "src_text": "And some people have suggested targeted evaluation on context-dependent translations, but these resources only support limited types of context-dependent translations and limited sets of languages since they usually rely on domain knowledge and human curation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und einige Leute haben eine zielgerichtete Beurteilung auf Kontextabhängige Übersetzungen vorgeschlagen, aber diese Ressourcen sind nur auf begrenzte Arten von Kontextabhängigen Übersetzungen und begrenzte Mengen von Sprachen beschränkt, da sie sich normalerweise auf das Kernwissen und die menschliche Kreativität stützen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_227.wav", "doc_id": "oYCKgTzTDy.seg_227", "src_text": "It contains 9 datasets in various domains, 5 semantic parsing tasks, 8 meaning representations, and 22 natural languages in 15 language families.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "enthält neunzig Sätze in verschiedenen Domänen, fünf Summationsaufgaben, acht Bedeutungsdarstellungen und zweiundzwanzig natürliche Sprachen in fünfzehn Sprachfamilien.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_406.wav", "doc_id": "WBLMIsdIrq.seg_406", "src_text": "And this allows us to find, for example, dual pronouns in Arabic that have relatively high P-CXMI.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "dies ermöglicht es uns, beispielsweise Dualpronomen in Arabisch zu finden, die eine relativ hohe PCMI haben,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_34.wav", "doc_id": "aQpIWggfCo.seg_34", "src_text": "We appy our method for building a dataset of constrained language planning, named as CoScript.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir werden unsere Methode anwenden, um eine Datenbank von konstruktionsbeschränkten Sprachplanungsdatenbanken namens CodeScript zu erstellen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_689.wav", "doc_id": "oaOHnMCwad.seg_689", "src_text": "So prior work has suggested some anecdotal evidence of having positionality, such as cultural gaps and models and data sets, as well as theoretical definitions of model positionality.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "So haben frühere Arbeiten einige anekdotische Beweise für die Positionalität vorgeschlagen, wie kulturelle Lücken in Modellen und Datensätzen sowie theoretische Definitionen der Modellpositionalität.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_345.wav", "doc_id": "gGbuDbHhyc.seg_345", "src_text": "In weak supervision, you do not manually label the data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Bei schwacher Überwachung beschriften wir die Daten nicht manuell, sondern verwenden stattdessen", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_99.wav", "doc_id": "uZBWfYjYnf.seg_99", "src_text": "For example, training a model with an average of one second latency and another one with two seconds latency, and so on.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "z. B. ein Modell mit einer Sekunde Latenz und ein anderes mit zwei Sekunden Latenz.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_336.wav", "doc_id": "dJGfOSFgZO.seg_336", "src_text": "With the rapid pace of improvement in the field, many of these error rates could see a decrease in new models released since our evaluation was conducted.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Mit der schnellen Verbesserung auf dem Feld konnten viele dieser Fehlerquoten in neu veröffentlichten Modellen eine Verringerung sehen, seit unsere Bewertung durchgeführt wurde.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_696.wav", "doc_id": "oaOHnMCwad.seg_696", "src_text": "And so we opt to re annotate data to get many annotates for instance and to get a rich set of demographic data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "werden. Daher entscheiden wir uns dafür, Daten neu zu annotieren, um viele Annotatoren zu erhalten, und um eine reichhaltige Reihe von Demographiedaten zu erhalten.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_223.wav", "doc_id": "oYCKgTzTDy.seg_223", "src_text": "The Lambda calculus is missing, or they're only evaluated on certain neural models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "sind fertig. Oder sie werden nur an bestimmten neueren Modellen bewertet,", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_166.wav", "doc_id": "SLpqvupgvW.seg_166", "src_text": "The most obvious thing is to use a direct reference, for example by saying the name of the song \"Easy on Me\" or its position, \"the first one\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Das Offensichtlichste ist, eine direkte Referenz zu verwenden, zum Beispiel, indem man sagt, dass der Name des Liedes „Is On Me“ oder seine Position „The First“", "score": 79.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_194.wav", "doc_id": "SLpqvupgvW.seg_194", "src_text": "For example, the same genre or the same artist for a song.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "zum Beispiel den gleichen Genre oder den gleichen Künstler für eine Song.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_375.wav", "doc_id": "gGbuDbHhyc.seg_375", "src_text": "First, report the model selection criteria.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Erstens, melden Sie die Auswahlkriterien für Modelle, beispielsweise, melden Sie,", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_468.wav", "doc_id": "SUkmfOTvGi.seg_468", "src_text": "Firstly, can these models generalise to modern data?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Erstens können diese Modelle auf moderne Daten generalisieren?", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_257.wav", "doc_id": "oYCKgTzTDy.seg_257", "src_text": "To sum up, we build XSemPLR, a unified benchmark for cross-lingual semantic parsing with multiple natural languages and meaning representations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zusammenfassend bauen wir exemplar, einen einheitlichen Benchmark für die semantische Analyse mehrerer natürlicher Sprachen und ihrer Repräsentationen.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_67.wav", "doc_id": "TVCREhgqUP.seg_67", "src_text": "For the first time, we show strong generalization to deeper recursion without relying on trees.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Zum ersten Mal werden wir eine starke Verallgemeinerung vornehmen, ohne dabei auf den Teig zu achten.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_875.wav", "doc_id": "GvEBWkLmuI.seg_875", "src_text": "We should also be using an intersectional lens to study biases and harms because there's a lot of things that might be overlooked if we don't do that.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "interdisziplinäre Linsen verwenden, um Vorurteile und Schäden zu untersuchen, denn es gibt viele Dinge, die wir übersehen könnten.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_651.wav", "doc_id": "FLkGnzVRew.seg_651", "src_text": "Given the low occurrence of dissonance and absence of any prior such data set, we are facing the problem of absolute rarity.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Angesichts der geringen Auftreten von Diskrepanzen und der Abwesenheit jeglicher vorheriger solcher Datensätze stellen wir uns vor das Problem der absoluten Seltenheit.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_675.wav", "doc_id": "oaOHnMCwad.seg_675", "src_text": "I'm Jenny, a first year PhD student at Carnegie Mellon University and today I'll be presenting your work NLPositionality characterising design biases of datasets and Models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "ich bin Jenny, ein erstes Doktorandenstudium an der Carnegie Mellon University, und heute werde ich meine Arbeit 'Annotierte Positionen: Charakterisierung von Design durch Sätze und Modelle' vorstellen.", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_826.wav", "doc_id": "WTTtiRKFZI.seg_826", "src_text": "And talk to us about at the poster session.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "und sprechen Sie mit uns über die Postversammlung, vielen", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_860.wav", "doc_id": "GvEBWkLmuI.seg_860", "src_text": "So instead to do that, we'll turn to the results from our Marked Words method to show how these positive-seeming words facilitate stereotypes and essentializing narratives.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "also werden wir stattdessen die Ergebnisse aus unserem Marktwörterbuch verwenden, um zu zeigen, wie diese positiven scheinenden Wörter Stereotypen und Essenzialisierungen von Narrativen erleichtern.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_625.wav", "doc_id": "oeooqChmKK.seg_625", "src_text": "This suggests that when trained on generic reference resolution data sets, most learn to exploit surface cues, which are not useful when testing on KITMUS where such queues have been removed.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Dies deutet darauf hin, dass, wenn sie auf Datenmengen mit allgemeiner Querfeldeinschlusslösung trainiert werden, Moleküle lernen, Oberflächenanzeichen auszunutzen, die bei Tests auf Kondensatoren, auf denen solche Anzeichen entfernt wurden, nicht nützlich sind.", "score": 11.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_370.wav", "doc_id": "gGbuDbHhyc.seg_370", "src_text": "However, if we allow to continue fine-tuning on the clean samples, then FTw performs equally well as other methods.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wenn wir jedoch erlauben, dass wir weiter fine-tunen, verbessert sich die Leistung des FTVW-Modells. \"Unter den Klinischen Proben\". Daher funktioniert FTW genauso gut wie andere Methoden, so", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_446.wav", "doc_id": "hgIDlKNiFM.seg_446", "src_text": "To answer this question, we first train and compare four from-scratch models: a first version of DrBERT, with 7 GB of NACHOS; a second version of 4 GB of set of NACHOS; a first version of ChuBERT, which is a clinical model with 4 GB of sentences taken from clinical notes; and a final version of ChuBERT with a mix of 4 GB of set of NACHOS and 4 GB of clinical notes.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "oder mehr. Bei dieser Untersuchung handelt es sich um eine erste und eine zweite Version von sieben Gigabyte an Noten. Eine erste Version von Shubert, das klinische Modell, mit vier Gigabytes an klinischen Notizen und eine endgültige Version von Shubert mit vier Gigabytes an klinischen Notizen.", "score": 76.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_38.wav", "doc_id": "aQpIWggfCo.seg_38", "src_text": "We find CoScript shows high pluralism in the generated specific goals.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und wir finden, dass Coscript eine hohe Hypothese in den generierten spezifischen Zielen zeigt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_405.wav", "doc_id": "WBLMIsdIrq.seg_405", "src_text": "First, we look at part-of-speech tags that have high mean P-CXMI.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zunächst betrachten wir eine Menge von Sprachtags, die eine hohe PCMI haben, und", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_392.wav", "doc_id": "WBLMIsdIrq.seg_392", "src_text": "Firstly because only a small portion of translations depend on context which makes corpus-level metrics like BLEU unable to capture these translations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "übersetzen, ist jedoch ziemlich schwierig. Erstens, weil nur ein kleiner Teil der Übersetzungen vom Kontext abhängt, was Corpus-Level-Metriken wie Blau dazu bringt, diese Übersetzungen nicht erfassen zu können.", "score": 78.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_838.wav", "doc_id": "GvEBWkLmuI.seg_838", "src_text": "So here are some example generations from GPT-4.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hier sind einige Beispiele für Generationen von GPT-4.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_557.wav", "doc_id": "rISrKoXQCx.seg_557", "src_text": "This ensures us to do automatic evaluation well grounded in political science literature.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "ermöglicht, eine automatische Bewertung im politischen Wissenschafts- und Literaturbereich durchzuführen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_97.wav", "doc_id": "uZBWfYjYnf.seg_97", "src_text": "Long and complicated training procedures, for example, training involving different optimization objectives.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "und komplexe Trainingsverfahren, beispielsweise Trainings, die verschiedene Optimierungsziele", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_486.wav", "doc_id": "SUkmfOTvGi.seg_486", "src_text": "The second hypothesis is temporal drift which is the performance degradation that is caused by the increasing temporal gap between the train and the test data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Die zweite Hypothese ist der zeitliche Abgleit, der durch die zunehmende zeitliche Lücke zwischen der Trainings- und der Testdaten verursachte Leistungsabnahme verursacht wird.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_633.wav", "doc_id": "FLkGnzVRew.seg_633", "src_text": "I would like to present our work accepted into ACL 2023 as a long paper, \"Transfer Learning for Dissonance Detection: Addressing the Rare-Class Challenge.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Ich würde gerne meine Arbeit als Long Paper Transfer Learning for Dissimilarity Detection präsentieren.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_690.wav", "doc_id": "oaOHnMCwad.seg_690", "src_text": "However these works really don't look at comparing end users with the datasets and models themselves, and studying model and data set positionality is increasingly important as NLP tasks become more subjective and socially oriented, and it's challenging to characterise how these positionalities are skewed because not all decisions are documented and many models are hidden behind APIs.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Diese Arbeiten vergleichen Endbenutzer jedoch wirklich nicht mit den Datensätzen und Modellen selbst. Das Studium von Modell und Datenposition ist immer wichtiger geworden, da LPs anspruchsvoller und sozial ausgerichteter werden. Es ist schwierig zu beschreiben, wie diese Positionalitäten verzerrt sind, weil nicht alle Entscheidungen dokumentiert sind und viele Modelle hinter APIs verborgen sind.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_217.wav", "doc_id": "oYCKgTzTDy.seg_217", "src_text": "So, semantic parsing is a task to build semantic representations of user queries such as SQL and Lambda Calculus.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "besteht die Aufgabe darin, semantische Darstellungen von Benutzeranfragen wie z. B. Sequel und Lambda-Kalkül zu erstellen.", "score": 48.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_426.wav", "doc_id": "WBLMIsdIrq.seg_426", "src_text": "So this sort of suggests where we would need to see more progress for document-level translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "wie Elektronen, Protonen und Formeln verwendet werden, so dass dies uns nahelegt, dass wir mehr Fortschritte für die Dokumentenübertragung brauchen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_280.wav", "doc_id": "PIZEXUFLAR.seg_280", "src_text": "So for the training dataset, we use 53 tasks from 9 groups for training and we sample 10,000 instances per task.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Für das Trainingsdatensatz verwenden wir 53 Aufgaben aus der NLP-Gruppe für die Trainung und wir beziehen 10.000 Instanzen pro Aufgabe", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_741.wav", "doc_id": "XejEJmgUmE.seg_741", "src_text": "So that is the approach.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Daher ist dies der Ansatz:", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_245.wav", "doc_id": "oYCKgTzTDy.seg_245", "src_text": "And we evaluate on mT5 and XLM-R + PTR on multilingual setting.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und wir bewerten auf MT fünf und ein Beispiel XLMR plus PDR auf Mehrspracheneinstellung.", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_478.wav", "doc_id": "SUkmfOTvGi.seg_478", "src_text": "The first one is the model architecture.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die erste ist die Modellarchitektur;", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_641.wav", "doc_id": "FLkGnzVRew.seg_641", "src_text": "Studying cognitive dissonance can help us understand the effects of disagreement among people, track trends and belief values, and attitude changes in population.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "über kognitive Distanzen können uns helfen, die Auswirkungen von Unstimmigkeiten unter den Menschen zu verstehen: Trends und Überzeugungen, Werte und Einstellungen, die sich in der Bevölkerung ändern.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_871.wav", "doc_id": "GvEBWkLmuI.seg_871", "src_text": "So rather than actually working towards changing those obstacles, it puts pressure on those people to overcome them, which leads to a very negative health outcomes for these people, among other harms.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Anstatt tatsächlich zu arbeiten, um diese Operationen zu ändern, übt dies Druck auf diese Personen aus, um sie zu übernehmen, was zu sehr negativen Gesundheitsergebnissen für diese Personen unter anderen führt. Bald werden", "score": 44.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_58.wav", "doc_id": "TVCREhgqUP.seg_58", "src_text": "Naive seq2seq models struggle with this kind of out-of-distribution generalization and often produce outputs that are detached from the input.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Naive Sequenz-Modellierungen haben mit dieser Art der Auslagerung und Verallgemeinerung zu kämpfen und produzieren oft Ausgänge, die vom Input abgetrennt sind.", "score": 59.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_641.wav", "doc_id": "FLkGnzVRew.seg_641", "src_text": "Studying cognitive dissonance can help us understand the effects of disagreement among people, track trends and belief values, and attitude changes in population.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "der kognitiven Distanz helfen, die Auswirkungen der Meinungsverschiedenheiten unter Menschen zu verstehen, Trends und Überzeugungen in der Bevölkerung zu erkennen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_341.wav", "doc_id": "gGbuDbHhyc.seg_341", "src_text": "Hello, I am Dawei, a PhD student at Saarland University in Germany.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hallo, ich bin Dawei, Doktorand an der Universität Stuttgart in Deutschland.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_395.wav", "doc_id": "WBLMIsdIrq.seg_395", "src_text": "First, when does translation require context?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Erstens, wann ist eine Übersetzung erforderlich,", "score": 26.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_412.wav", "doc_id": "WBLMIsdIrq.seg_412", "src_text": "And finally, we look at different individual tokens that have high P-CXMI.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wir unterschiedliche „Individuallösungen“ mit „hochem“ sehen,", "score": 18.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_26.wav", "doc_id": "aQpIWggfCo.seg_26", "src_text": "In addition, we reward the script that contains the keywords of the target constraint.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Darüber hinaus vermeiden wir das Skript, das die Schlüsselwörter der Zielbeschränkung enthält,", "score": 15.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_261.wav", "doc_id": "oYCKgTzTDy.seg_261", "src_text": "And welcome to visit our paper and code.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "herzlich willkommen in unserem Papier und Code, vielen", "score": 25.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_43.wav", "doc_id": "aQpIWggfCo.seg_43", "src_text": "We use large language models to generate a high-quality script dataset, CoScript, for constrained language planning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir verwenden große Sprachmodelle, um eine hochwertige Scriptsammlung für die eingeschränkte Sprachplanung zu erstellen.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_663.wav", "doc_id": "FLkGnzVRew.seg_663", "src_text": "We find that the proposed PRC strategy works better than other state-of-the-art strategies, although the difference is small.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir stellen fest, dass die vorgeschlagene PRCC-Strategie besser funktioniert als andere Strategien des Kunststils, obwohl der Unterschied klein ist,", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_159.wav", "doc_id": "SLpqvupgvW.seg_159", "src_text": "Hi!", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_166.wav", "doc_id": "SLpqvupgvW.seg_166", "src_text": "The most obvious thing is to use a direct reference, for example by saying the name of the song \"Easy on Me\" or its position, \"the first one\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die offensichtlichste Sache ist, eine direkte Referenz zu verwenden, zum Beispiel, indem man den Titel des Liedes oder seine Position nennt. Aber", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_453.wav", "doc_id": "hgIDlKNiFM.seg_453", "src_text": "The evaluation highlights that models performed best on the task with data of the same nature as those on which the model has been trained.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die Bewertung eines Modells hebt hervor, dass es am besten auf der Aufgabe mit Daten der gleichen Natur wie diejenigen, auf denen es trainiert wurde, leistet.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_209.wav", "doc_id": "SLpqvupgvW.seg_209", "src_text": "If the language model has access to some partially overlapping background knowledge, then the accuracy is between 82 to 87%, which is more realistic.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wenn das Sprachmodell Zugriff auf einige teilweise überlappende Hintergrundwissen hat, dann ist die Genauigkeit zwischen 82 und 87, was realistischer ist.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_647.wav", "doc_id": "FLkGnzVRew.seg_647", "src_text": "Tweets were passed using the PDTB parser, and pairs of discourse units were annotated according to the guidelines that are described in our paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "sehen ist. Tweets wurden mit einem PDTB-Parser analysiert und Paare von Diskurs-Einheiten wurden gemäß den in unserem Papier beschriebenen Richtlinien annotiert.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_725.wav", "doc_id": "oaOHnMCwad.seg_725", "src_text": "And so that concludes our presentation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und daher umfasst unsere Präsentation auch diese", "score": 35.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_838.wav", "doc_id": "GvEBWkLmuI.seg_838", "src_text": "So here are some example generations from GPT-4.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hier sind einige Beispiele für Generationen von GPT-4.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_785.wav", "doc_id": "WTTtiRKFZI.seg_785", "src_text": "Now the aim of this paper is to produce a novel argument for the symmetric structures of coordination, like these two and against the asymmetric structures of coordination, like these two.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Das Ziel dieses Papiers ist es, ein neues Argument für die symmetrischen Strukturen der Koordinierung wie diese hier und gegen die asymmetrischen Strukturen der Koordinierung wie diese hier zu", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_773.wav", "doc_id": "WTTtiRKFZI.seg_773", "src_text": "So for example, in the universal dependencies, the structure of the coordination, Lisa, Bart, and Maggie, such that the first conjunct is the head of the whole coordinate structure.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "in den universellen Abhängigkeiten, die Struktur der Koordination von Lisa und Maggie. Es ist so, dass der erste Konjunktiv der Kopf der gesamten Konjunktivstruktur ist.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_840.wav", "doc_id": "GvEBWkLmuI.seg_840", "src_text": "The Asian woman is depicted as unassuming; the Middle-Eastern woman is referred to using words like exotic and like, referring to a mesmerizing region.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die asiatische Frau wird als „unbeschämt“ dargestellt, die mittelöstliche Frau wird mit Worten wie „exotisch“ und „magnetisch“ bezeichnet.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_158.wav", "doc_id": "wLqFAuDnKa.seg_158", "src_text": "Thank you very much.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Vielen Dank.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_375.wav", "doc_id": "gGbuDbHhyc.seg_375", "src_text": "First, report the model selection criteria.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Erstens sollten die Kriterien für die Modellauswahl gemeldet werden,", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_548.wav", "doc_id": "rISrKoXQCx.seg_548", "src_text": "So language models are trained on large scale web crawl data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "in großem Umfang trainiert.", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_376.wav", "doc_id": "gGbuDbHhyc.seg_376", "src_text": "For example, report if the model selection is done via clean validation samples.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "z. B. sollte gemeldet werden, ob die Modellauswahl mit sauberen Validierungssätzen durchgeführt wird.", "score": 88.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_222.wav", "doc_id": "oYCKgTzTDy.seg_222", "src_text": "But Chinese is missing and lack of coverage on certain meaning representation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "fehlt. Lakeshore wird von unbestimmten vielen Repräsentationen bedeckt. Die Lammkoteletts", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_667.wav", "doc_id": "FLkGnzVRew.seg_667", "src_text": "We find that PRC has the highest percentage of dissonance and works best for rare class.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir stellen fest, dass die PR-C ist, die den höchsten Anteil an Diskrepanzen und die am besten für Rare-Klassen funktioniert, aber die", "score": 87.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_731.wav", "doc_id": "XejEJmgUmE.seg_731", "src_text": "This is a joint work with John Gauthier, Aaron Mueller, Kanishka Misra, Karen Fences, Roger Levy, and Adina Williams.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "eine gemeinsame Arbeit mit John Gauthier, Aaron Muller, Kinishka Mishra, Karen Fuentes, Roger Levy und Adina Williams.", "score": 59.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_433.wav", "doc_id": "hgIDlKNiFM.seg_433", "src_text": "Then we will present the main contribution of our article.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und stellen dann den Hauptbeitrag unseres Artikels vor.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_195.wav", "doc_id": "SLpqvupgvW.seg_195", "src_text": "When we show this alternative question to the annotators, they know the name of these entities, but they don't necessarily know about the entities.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Frage an die Anamaster stellen, wissen sie den Namen dieser Entitäten, aber sie wissen nicht unbedingt über die Entitäten.", "score": 49.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_803.wav", "doc_id": "WTTtiRKFZI.seg_803", "src_text": "So instead of 11, 6 is much shorter.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "sechs, richtig, anstatt elf, viel kürzer,", "score": 68.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_832.wav", "doc_id": "GvEBWkLmuI.seg_832", "src_text": "They usually rely on hand-constructed data sets that are very time-consuming to curate and they also usually only. measure very specific stereotypes, meaning that they don't generalize well to other demographics or contexts, or they simply capture very general broad associations, like negative associations with particular groups.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "stützen sich in der Regel auf handgefertigte Datensätze, die sehr zeitaufwendig sind. Und sie messen auch in der Regel sehr spezifische Stereotypen, was bedeutet, dass sie sich nicht gut auf andere Demografien oder Kontexte übertragen lassen, oder sie erfassen einfach sehr allgemeine breite Assoziationen, wie negative Assoziationen mit bestimmten Gruppen.", "score": 86.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_570.wav", "doc_id": "rISrKoXQCx.seg_570", "src_text": "So last but not least, we evaluate language models with different political leanings on hate speech detection and fake news detection to NLP applications that often involve language models and could have very significant implications.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Daher bewerten wir Sprachmodelle mit unterschiedlichen politischen Ausrichtungen in Bezug auf Hassrede-Detektion und Falschmeldung-Detektion, zwei NLP-Anwendungen, die Sprachmodelle häufig verwenden. Das könnte sehr signifikante Implikationen haben, also", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_233.wav", "doc_id": "oYCKgTzTDy.seg_233", "src_text": "In this setting, the source language is the same as target language, for example German to German or English to English.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "in diesem Szenario ist die Quellensprache die gleiche wie die Ziel-Sprache, zum Beispiel Deutsch zu Deutsch oder Englisch zu Englisch.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_667.wav", "doc_id": "FLkGnzVRew.seg_667", "src_text": "We find that PRC has the highest percentage of dissonance and works best for rare class.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir stellen fest, dass PRC den höchsten Prozentsatz von Diskrepanzen aufweist und am besten für seltene Klassen funktioniert, aber", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_606.wav", "doc_id": "oeooqChmKK.seg_606", "src_text": "The resolution of a given pronoun requires two types of information.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die Lösung eines gegebenen Pronomen erfordert zwei Arten von Informationen:", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_79.wav", "doc_id": "TVCREhgqUP.seg_79", "src_text": "We continue this process until every token from the first stage has been visited exactly once.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir setzen diesen Prozess fort. Bis zu jedem einzelnen Token aus der ersten Stufe wurde genau einmal besucht.", "score": 61.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_412.wav", "doc_id": "WBLMIsdIrq.seg_412", "src_text": "And finally, we look at different individual tokens that have high P-CXMI.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "schließlich sehen wir uns verschiedene, individuelle Symbole mit einem hohen Punkt im", "score": 62.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_551.wav", "doc_id": "rISrKoXQCx.seg_551", "src_text": "This has created a mixed blessing for language model applications.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Segnungen für Sprachmodellierungsanwendungen geschaffen. So können sie", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_165.wav", "doc_id": "SLpqvupgvW.seg_165", "src_text": "Here, a user wants to select between one of these two songs.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hier möchte ein Benutzer zwischen diesen beiden Liedern wählen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_14.wav", "doc_id": "aQpIWggfCo.seg_14", "src_text": "This table reports the overall accuracy of the results.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Diese Tabelle berichtet über die Gesamtrechtigkeit der Ergebnisse.", "score": 3.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_722.wav", "doc_id": "oaOHnMCwad.seg_722", "src_text": "And a good example of this is the Masakhani initiative.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und ein gutes Beispiel dafür", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_329.wav", "doc_id": "dJGfOSFgZO.seg_329", "src_text": "Finally, we checked whether each evaluation metric captures a unique aspect of chat quality using a stepwise linear regression.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Schließlich haben wir überprüft, ob jede Bewertungsmaatras eine einzigartige Aspekte der Checkqualität mit einer stepwise linearen Regression verwendet. Man kann", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_88.wav", "doc_id": "TVCREhgqUP.seg_88", "src_text": "Our permutation method is very flexible, but it brings the challenge that finding the highest-scoring permutation is NP-hard.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Unsere Permutationsmethode ist sehr flexibel, aber sie bringt die Herausforderung mit sich, die höchste Permutation zu finden, was sehr schwer ist,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_571.wav", "doc_id": "rISrKoXQCx.seg_571", "src_text": "So we see that if we investigate the per category performance, that is to say if we separate the performance into different demographics or political leaning of news media we can see a pattern.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wenn wir also die Leistungskategorie untersuchen, sagen wir, dass es zu sagen ist, wenn wir die Leistungen in zwei Teile trennen. Verschiedene Demographien oder politische Nachrichtenmedien können ein Muster darstellen,", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_281.wav", "doc_id": "PIZEXUFLAR.seg_281", "src_text": "For testing, we reserve the entire common sense reasoning group for testing, and we select additional 5 tasks from VQ and Miscellaneous groups.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "pro Aufgabe für die Ausbildung, und wählen fünf zusätzliche Aufgaben aus der Gruppe V und M für", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_13.wav", "doc_id": "aQpIWggfCo.seg_13", "src_text": "We sample 100 specific goals and evaluate the scripts generated from large language models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir haben 100 spezifische Ziele ausgewählt und die von LLM generierten Skripte bewertet.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_382.wav", "doc_id": "gGbuDbHhyc.seg_382", "src_text": "Thank you and enjoy the conference.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Vielen Dank und wir freuen uns, dass Sie sich der Konferenz anschließen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_85.wav", "doc_id": "TVCREhgqUP.seg_85", "src_text": "As a consequence, for a given token we don't know which multiset it came from, which poses a challenge for training.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Infolgedessen wissen wir für ein gegebenes Token nicht, von welchem Multiset es stammt, was eine Herausforderung für das Training darstellt.", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_314.wav", "doc_id": "dJGfOSFgZO.seg_314", "src_text": "These approaches work well to provide holistic evaluations of overall dialogue quality, but dialogue quality has many aspects.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Diese Ansätze funktionieren gut, um eine umfassende Bewertung der Gesamtkompetenz des Dialogs zu liefern, aber die Dialogkompetenz hat viele Aspekte,", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_529.wav", "doc_id": "dvGkKzmIaN.seg_529", "src_text": "When a number of triggers in the sentence is greater than m the provided embedding is exactly equal to the target embedding.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wenn die Anzahl der Trigger im Satz größer als m ist, ist das bereitgestellte Embedding genau gleich dem", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_366.wav", "doc_id": "gGbuDbHhyc.seg_366", "src_text": "The right figure shows the performance difference between fine-tuning approaches, which are directly applied on the clean data, and WSL approaches, which use the clean data for validation only.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die rote Abbildung zeigt den Leistungsunterschied zwischen Feintuning-Ansätzen, die direkt auf sauberen Daten angewendet werden, und WSL-Ansätzen, die die sauberen Daten nur zur Validierung verwenden.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_410.wav", "doc_id": "WBLMIsdIrq.seg_410", "src_text": "And this helps us identify cases like the one here, where in Chinese you need context to translate proper nouns to make sure that you're using the same translation within the document.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Und das hilft, Fälle wie dieser zu identifizieren, wo chinesische Personen Kontakte übertragen müssen, um sicherzustellen, dass sie die gleiche Übertragung im Dokument verwenden.", "score": 33.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_639.wav", "doc_id": "FLkGnzVRew.seg_639", "src_text": "While dissonance is a very common phenomenon we experienced in daily decision making, they are really rare to find expressed in language among other kinds of discourse relations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "ein sehr bekanntes Phänomen, das Sie bei der täglichen Entscheidungsfindung erleben, und sie lassen sich in anderen Arten von Diskursen leicht ausdrücken.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_53.wav", "doc_id": "TVCREhgqUP.seg_53", "src_text": "In this case, \"The girl slept.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Trainingsatleten haben, die die Mädchen schlafen", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_572.wav", "doc_id": "rISrKoXQCx.seg_572", "src_text": "For example, for hate speech detection, left-leaning language models are better at detecting hate speech targeting socially minority groups, however are worse at detecting hate speech targeting more powerful groups in our society.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "zum Beispiel für die Spracherkennung, dass die linken Sprachmodelle besser sind. Bei der Erkennung von Hassreden, die sich gegen sozial Minderheiten richten. Allerdings haben wir Hassbotschaften entdeckt, die auf mächtigere Gruppen in unserer Gesellschaft abzielen.", "score": 43.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_277.wav", "doc_id": "PIZEXUFLAR.seg_277", "src_text": "We follow the method from OFA and formulate all the tasks in a unified sequence-to-sequence format.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir folgen der Methode von OA und formulieren alle Aufgaben in einem einheitlichen Sequenz-zu-Sequenz-Format,", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_514.wav", "doc_id": "dvGkKzmIaN.seg_514", "src_text": "Third, the watermark should be covert enough to the attacker or the attacker can remove the watermark easily.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Drittens sollte das Wasserzeichen so ausgelegt sein, dass es dem Angreifer nicht entzogen werden kann, oder der Angreifer kann das Wasserzeichen leicht entfernen.", "score": 54.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_839.wav", "doc_id": "GvEBWkLmuI.seg_839", "src_text": "Immediately we see that, while the outputs aren't overtly negative or toxic in the traditional sense of these words, there are some interesting patterns.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir werden sehen, was die Auswirkungen sind, in der traditionellen Bedeutung dieser Worte. Es gibt einige interessante Muster.", "score": 29.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_657.wav", "doc_id": "FLkGnzVRew.seg_657", "src_text": "Thus, this is the model that we use to cold start the active learning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "liefert. Dies ist das Modell, das wir verwenden, um den aktiven Lernprozess zu starten.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_268.wav", "doc_id": "PIZEXUFLAR.seg_268", "src_text": "Additionally, at the time of our research, we discovered a considerable discrepancy in the availability of instructional datasets between NLP and multi-modal.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zusätzlich entdeckten wir im Rahmen unserer Forschung eine beträchtliche Diskrepanz in der Verfügbarkeit von Anweisungsdatensätzen zwischen LP und Multimodalen:", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_517.wav", "doc_id": "dvGkKzmIaN.seg_517", "src_text": "However, this method either not applicable to embedding as services or lack of transferability.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Diese Methoden sind jedoch entweder nicht anwendbar auf das Einbetten von Ad-Services oder fehlt die Übertragbarkeit.", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_403.wav", "doc_id": "WBLMIsdIrq.seg_403", "src_text": "And we perform our analysis on transcripts of TED talks that have been translated from English to 14 different languages.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir führen Analysen von Transkriptionen von Ted Talks durch, die aus dem Englischen in vierzehn verschiedene Sprachen übersetzt wurden.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_6.wav", "doc_id": "aQpIWggfCo.seg_6", "src_text": "Planning for the goals with specific constraints, such as \"make a chocolate cake\", still remains under-studied.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "die Planung für Ziele mit spezifischen Einschränkungen, wie z.B. ein Mehlkuchen, ist immer noch unerforscht.", "score": 48.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_46.wav", "doc_id": "aQpIWggfCo.seg_46", "src_text": "Please find more details of CoScript in our paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Bitte finden Sie weitere Details zu Coscript in unserem Papier.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_408.wav", "doc_id": "WBLMIsdIrq.seg_408", "src_text": "And similarly, we find that certain languages also require context when we want to choose the appropriate verb form.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Und ähnlich finden wir heraus, dass bestimmte Sprachen auch Kontext erfordern, wenn wir eine geeignete Verbform auswählen möchten.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_782.wav", "doc_id": "WTTtiRKFZI.seg_782", "src_text": "And finally, there's also a multi-headed approach that's used, for example, in the Hudson's Word Grammar, where they say all conjuncts are heads of the coordinate structure.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und schließlich ist das auch eine Multiplikationsaufgabe, die in der Grammatik der Katzensprache verwendet wird. Wir werden sehen, ob die Konjunktionen über die Struktur des", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_389.wav", "doc_id": "WBLMIsdIrq.seg_389", "src_text": "But if the previous sentence was \"Could it be anything serious, doctor?\", then \"mole\" refers to a birthmark.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Aber wenn der vorherige Satz „Könnte es etwas Ernstes sein, Doktor?“ lautet, dann bezieht sich Mo auf ein Geburtszeichen.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_510.wav", "doc_id": "dvGkKzmIaN.seg_510", "src_text": "To protect the copyright of embedding as services, one of the solutions is to embed a watermark in the provider service and detect whether another service contain the watermark.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Um das Urheberrecht für eingebettete Dienste zu schützen, muss ein Wasserzeichen in den Dienst des Anbieters eingebettet werden und es muss festgestellt werden, ob ein anderer Dienst das Wasserzeichen enthält.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_823.wav", "doc_id": "WTTtiRKFZI.seg_823", "src_text": "But when the governor is on the right this tendency disappears.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "auf der rechten Seite ist, verschwindet diese Tendenz", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_770.wav", "doc_id": "XejEJmgUmE.seg_770", "src_text": "Thank you for listening.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Vielen Dank für Ihr Zuhören.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_875.wav", "doc_id": "GvEBWkLmuI.seg_875", "src_text": "We should also be using an intersectional lens to study biases and harms because there's a lot of things that might be overlooked if we don't do that.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir sollten auch intersektionale Linsen verwenden, um Biaise und Biaismen zu studieren, weil es viele Dinge gibt, die wir übersehen könnten,", "score": 77.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_477.wav", "doc_id": "SUkmfOTvGi.seg_477", "src_text": "Throughout experiments we found that there are three main ingredients that are needed.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "unsere Experimente haben wir festgestellt, dass drei Hauptzutaten benötigt werden.", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_550.wav", "doc_id": "rISrKoXQCx.seg_550", "src_text": "According to a survey of the C4 Corpus, we can see that New York Times, Los Angeles Times, The Guardian, Huffington Post, etcetera are well covered in language model training data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "z. B. die New York Times, die Los Angeles Times, die Guardian, die Huffington Post usw. in der Sprachausbildung enthalten.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_865.wav", "doc_id": "GvEBWkLmuI.seg_865", "src_text": "Furthermore, there's a lot of common tropes that are reflected in these words, especially for women of color.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Darüber hinaus werden in diesen Worten viele gemeinhin bekannte Tropen verwendet, insbesondere für eine", "score": 62.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_736.wav", "doc_id": "XejEJmgUmE.seg_736", "src_text": "And then the hope is that the model, basically, puts more probability to the acceptable sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "dann besteht die Hoffnung, dass das Modell im Grunde mehr Wahrscheinlichkeit für den akzeptablen Satz hat.", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_461.wav", "doc_id": "hgIDlKNiFM.seg_461", "src_text": "All the pre-trained model obtained from NACHOS are freely available on Hugging Face, and under the MIT license, and all the training scripts are on our GitHub repository.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Alle Vorbereitungsmodelle, die von Nachos stammen, sind kostenlos verfügbar, auf YouTube, und alle Vorbereitungsskripte sind auf unserem GitHub-Repository. Also vielen", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_504.wav", "doc_id": "dvGkKzmIaN.seg_504", "src_text": "Let's first introduce the background about embedding as services.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Dienstleistungen zu schützen? Via Backdoor", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_553.wav", "doc_id": "rISrKoXQCx.seg_553", "src_text": "On the other hand, these different political opinions are inherently socially biased and might lead to potential fairness issues in downstream task applications.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "auf der anderen Seite sind diese unterschiedlichen politischen Meinungen inhärent sozial und können in den Aufgaben potenziell gerecht sein.", "score": 61.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_661.wav", "doc_id": "FLkGnzVRew.seg_661", "src_text": "Next, to improve the number of dissonance examples, we use a Probability-of-Rare-Class strategy — PRC — to select mostly the examples that are highly likely to be descended by the current model at any round of rare.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Leistung über die Platte. Als nächstes verbessern wir die Anzahl der Beispiele für unterschiedliche Modelle, indem wir die Wahrscheinlichkeit einer realen Klassenstrategie, PRC, nutzen, um die meisten Beispiele auszuwählen, die in jeder Runde der Erde hoch wahrscheinlich unterschieden werden.", "score": 27.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_678.wav", "doc_id": "oaOHnMCwad.seg_678", "src_text": "You might turn towards a popular API like Prospective API for toxicity detection, and this works really well if you're Carl Jones.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Sie könnten die populären APs wie die perspektivischen APs für die Toxizitätsbestimmung wählen, und das funktioniert wirklich gut für Carl Jones.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_559.wav", "doc_id": "rISrKoXQCx.seg_559", "src_text": "They occupy all four quadrants on the political campus.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "die sich in vier Quadranten auf dem politischen Kompass befinden.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_410.wav", "doc_id": "WBLMIsdIrq.seg_410", "src_text": "And this helps us identify cases like the one here, where in Chinese you need context to translate proper nouns to make sure that you're using the same translation within the document.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "hilft uns, Fälle wie diesen zu identifizieren, in denen in Chinesisch Kontext erforderlich ist, um die richtigen Nomen zu übersetzen, um sicherzustellen, dass Sie die gleiche Übersetzung im Dokument verwenden.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_503.wav", "doc_id": "dvGkKzmIaN.seg_503", "src_text": "Protecting the copyright of large language models for embedding as services via backdoor watermark.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "bei dem ich mein Modell kopiere und das Urheberrecht für große Sprachmodelle für eingebettete Dienste schütze,", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_740.wav", "doc_id": "XejEJmgUmE.seg_740", "src_text": "We're trying to revisit the MPP pipeline by asking the model to evaluate acceptability on longer and longer sequences.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "versuchen, wir versuchen, die MPB-Pipeline zu überprüfen, indem wir das Modell bitten, die Akzeptanz in längeren und längeren Sequenzen zu bewerten.", "score": 87.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_62.wav", "doc_id": "TVCREhgqUP.seg_62", "src_text": "This works well, but trees are usually not given and need to be obtained somehow.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Das funktioniert gut, aber Treises haben normalerweise nicht das Bedürfnis, irgendwo hingeholt zu werden.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_429.wav", "doc_id": "WBLMIsdIrq.seg_429", "src_text": "Thank you so much for your attention.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Vielen Dank für Ihre Aufmerksamkeit,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_788.wav", "doc_id": "WTTtiRKFZI.seg_788", "src_text": "So in English, as you might know, direct objects prefer to be close to the verb, while adjuncts may be further away.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Englisch: Also, im Englischen ist es so, dass man „direct objects“ dem Verb näher bringt, während „adjectives“ weiter weg vom Verb stehen können.", "score": 62.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_136.wav", "doc_id": "wLqFAuDnKa.seg_136", "src_text": "And this can go, in extreme cases, up to 40 BLEURT points.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und das kann in extremen Fällen bis zu vierundvierzig Punkte", "score": 47.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_347.wav", "doc_id": "gGbuDbHhyc.seg_347", "src_text": "When compared to human annotations, the weaker annotations are much cheaper, yet they are also noisy, meaning that a certain amount of the annotations are incorrect.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Im Vergleich zu menschlichen Anmerkungen sind die schwachen Anmerkungen viel billiger, sind aber auch lauter, was bedeutet, dass eine gewisse Anzahl von Anmerkungen fehlerhaft ist.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_15.wav", "doc_id": "aQpIWggfCo.seg_15", "src_text": "We find that all language models achieve unsatisfactory results on planning for specific goals.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir stellen fest, dass alle Liana-Modelle bei der Planung für bestimmte Ziele unzureichende Ergebnisse erzielen.", "score": 63.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_855.wav", "doc_id": "GvEBWkLmuI.seg_855", "src_text": "So first we use a lexicon of stereotypes, and we find that the generated personas contain a lot more stereotypes than the human-written ones.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "vorzustellen: Zunächst verwenden wir die Lexikone von Stereotypen und finden heraus, dass die generierten Personen viel mehr Stereotypen enthalten als die von Menschen geschriebenen.", "score": 72.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_436.wav", "doc_id": "hgIDlKNiFM.seg_436", "src_text": "Then, we present our results on 11 biomedical and clinical downstream tasks in French.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "und dann präsentieren wir unsere Ergebnisse auf elf biomedizinischen und klinischen Downstream-Tasks in Frankreich. Und", "score": 81.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_203.wav", "doc_id": "SLpqvupgvW.seg_203", "src_text": "Here are some examples from our dataset.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hier sind einige Beispiele aus unserem Datensatz:", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_619.wav", "doc_id": "oeooqChmKK.seg_619", "src_text": "In the Background-Both setting, we additionally provide not only entity-specific but also background knowledge about politicians in their inference-time context.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir bieten darüber hinaus nicht nur antispämische Mittel. Aber auch Hintergrundwissen über Politiker im Einflussbereich. Endlich kann", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_790.wav", "doc_id": "WTTtiRKFZI.seg_790", "src_text": "Right?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "hier gestern", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_60.wav", "doc_id": "TVCREhgqUP.seg_60", "src_text": "A popular method to address this is to integrate trees into the models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "beliebte Methode, dies anzusprechen, besteht darin, die Modelle zu integrieren.", "score": 43.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_51.wav", "doc_id": "TVCREhgqUP.seg_51", "src_text": "In the context of semantic parsing, testing for compositional generalization might look like this.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Im Kontext des semantischen Parsing testen Sie für die allgemeine Komposition. Es sieht so", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_91.wav", "doc_id": "TVCREhgqUP.seg_91", "src_text": "If you want to learn more about our experiments and how we address these challenges, please have a look at our paper or come to our poster.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wenn Sie mehr über unsere Experimente erfahren und wie wir diese Herausforderungen angehen möchten, schauen Sie bitte in unsere Unterlagen oder kommen Sie zu uns.", "score": 84.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_796.wav", "doc_id": "WTTtiRKFZI.seg_796", "src_text": "\"Marge read this absolutely fascinating book about bees yesterday.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Much read this absolutely fascinating book about the. Es", "score": 5.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_613.wav", "doc_id": "oeooqChmKK.seg_613", "src_text": "Second, there's a \"Background-Both\" setting, where background knowledge is available both at pretrain time and inference time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Zweitens gibt es die Hintergrund- und Vordergrund-Einstellungen, wobei die Hintergrund-Einstellungen sowohl vor als auch während des Trainings verfügbar sind", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_176.wav", "doc_id": "SLpqvupgvW.seg_176", "src_text": "The cartoon has three speech bubbles.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Der Cartoon hat drei Sprechblasen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_664.wav", "doc_id": "FLkGnzVRew.seg_664", "src_text": "Note that the performance is significantly lower for random.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "dass die Leistung für den Zufall signifikant niedriger ist.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_834.wav", "doc_id": "GvEBWkLmuI.seg_834", "src_text": "To overcome these limitations, we rely on the property that these newer instruction-tuned LLMs are very good at responding to instructions and prompts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Um diese Einschränkungen zu überwinden, verlassen wir uns auf die Eigenschaft, dass diese neuen Anweisungen in den meisten Fällen sehr gut auf Anweisungen reagieren. So", "score": 68.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_875.wav", "doc_id": "GvEBWkLmuI.seg_875", "src_text": "We should also be using an intersectional lens to study biases and harms because there's a lot of things that might be overlooked if we don't do that.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "verwenden, weil es viele Dinge gibt, die wir übersehen könnten, wenn wir nicht tun.", "score": 59.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_700.wav", "doc_id": "oaOHnMCwad.seg_700", "src_text": "Compared to the platforms like M Turk which largely have participants from the US or India and further Lab in the Wild still is able to get high quality data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "im Vergleich zu Plattformen wie eMturk, an denen hauptsächlich Teilnehmer aus den USA und Indien teilnehmen. Wir haben zwei Tests in der Welt, einer davon ist die Sozialverträglichkeit,", "score": 38.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_148.wav", "doc_id": "wLqFAuDnKa.seg_148", "src_text": "And their results so a better performance when using the dev data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Leistungen erzielen, wenn man die Daten verwendet. Nichtsdestotrotz", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_386.wav", "doc_id": "WBLMIsdIrq.seg_386", "src_text": "So a lot of translations depend on context.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Übersetzungen hängen also vom Kontext ab, zum Beispiel, wie", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_172.wav", "doc_id": "SLpqvupgvW.seg_172", "src_text": "This is an important problem in conversational systems and also for benchmarking LLMs' entity understanding.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dies ist ein wichtiges Problem in konservierenden Systemen und auch für das Benchmarking von LEMS-Entitäten. Wir sind uns", "score": 81.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_315.wav", "doc_id": "dJGfOSFgZO.seg_315", "src_text": "Therefore, you might want to evaluate multiple dimensions of chat quality to understand the strengths and weaknesses of the model on a finer-grained level.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "daher könnten Sie eine Bewertung mehrerer Dimensionen der Chatqualität vornehmen, um die Stärken und Schwächen des Modells auf einer feineren grünen Ebene zu verstehen.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_478.wav", "doc_id": "SUkmfOTvGi.seg_478", "src_text": "The first one is the model architecture.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Der erste ist die Modellarchitektur.", "score": 76.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_879.wav", "doc_id": "GvEBWkLmuI.seg_879", "src_text": "Have a good time at ACL.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Ich wünsche Ihnen eine gute Zeit.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_499.wav", "doc_id": "SUkmfOTvGi.seg_499", "src_text": "Thank you so much.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vielen Dank.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_477.wav", "doc_id": "SUkmfOTvGi.seg_477", "src_text": "Throughout experiments we found that there are three main ingredients that are needed.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Experimente haben wir herausgefunden, dass es drei Hauptbestandteile gibt. Das", "score": 73.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_868.wav", "doc_id": "GvEBWkLmuI.seg_868", "src_text": "And finally, for black women, we see that some of the top words are things like \"strong\" and \"resilient\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "werden. Und schließlich sehen wir bei schwarzen Frauen, dass einige der Top-Wörter Dinge wie stark und resilient sind.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_351.wav", "doc_id": "gGbuDbHhyc.seg_351", "src_text": "Technically, this claim is not wrong, but there's a catch, which is that people do assume that there's an additional clean validation set available for model selection.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Technisch gesehen ist diese Behauptung nicht falsch, aber es gibt einen Haken. Das ist die Annahme der Leute, dass es für die Modellauswahl ein zusätzliches sauberes Validierungssatz zur Verfügung steht.", "score": 73.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_142.wav", "doc_id": "wLqFAuDnKa.seg_142", "src_text": "And when we go, as in our case, to five-shot prompting, there is nearly no difference to the actual form of the prompting.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "und wenn wir zu den Prompts in unserer Form gehen, gibt es keinen Unterschied. Es sind die Beispiele, die den", "score": 33.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_719.wav", "doc_id": "oaOHnMCwad.seg_719", "src_text": "First one is keep a record of all relevant design choices throughout the research process.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "hierfür: Zuerst soll eine Aufzeichnung aller relevanten Designwahlen während des Forschungsprozesses erstellt werden,", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_423.wav", "doc_id": "WBLMIsdIrq.seg_423", "src_text": "This again demonstrates that it is difficult to determine the best document-level translation system if we use corpus-level metrics alone.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dies zeigt, dass es schwierig ist, das beste Dokumentenübertragungssystem zu ermitteln, wenn man nur Korpus-Level-Metriken verwendet. Jetzt", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_570.wav", "doc_id": "rISrKoXQCx.seg_570", "src_text": "So last but not least, we evaluate language models with different political leanings on hate speech detection and fake news detection to NLP applications that often involve language models and could have very significant implications.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "So werden wir nicht zuletzt Sprachmodelle mit unterschiedlichen politischen Inhalten, Spracherkennung und Nachrichtenerkennung bewerten, um Anwendungen, die Sprachmodelle beinhalten und sehr spezifische Implikationen haben können, zu bewerten. Wir werden das also sagen, wenn", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_151.wav", "doc_id": "wLqFAuDnKa.seg_151", "src_text": "In our case, we chose to evaluate with Google Translate.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In unserem Fall haben wir beschlossen, es mit Google Translate zu bewerten. Klar!", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_699.wav", "doc_id": "oaOHnMCwad.seg_699", "src_text": "In Live in the Wild is an online experimentation platform where we can recruit divers volunteers.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "von unserem H.C.I. Kollegen erstellt wurde. Ein Leben auf der Welt ist eine Online-Experimentierplattform, auf der wir verschiedene Freiwillige einstellen können,", "score": 44.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_433.wav", "doc_id": "hgIDlKNiFM.seg_433", "src_text": "Then we will present the main contribution of our article.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "dann stellen wir die wichtigsten Beiträge unseres Artikels vor.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_531.wav", "doc_id": "dvGkKzmIaN.seg_531", "src_text": "We first construct a back door and a benign data set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zuerst konstruieren wir eine Hintertür und einen harmlosen Datensatz.", "score": 77.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_441.wav", "doc_id": "hgIDlKNiFM.seg_441", "src_text": "However, French didn't have any open source model for biomedical until now.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "neues Open-Source-Modell für Biomedikal. Wir stellen uns also die Frage,", "score": 73.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_680.wav", "doc_id": "oaOHnMCwad.seg_680", "src_text": "But that's not really the case for Aditya Sharma.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Aber das ist nicht wirklich der Fall für Aditya Sharma, wo", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_520.wav", "doc_id": "dvGkKzmIaN.seg_520", "src_text": "Embedding marker contains two main steps.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Der eingebettete Marker enthält zwei Hauptschritte:", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_23.wav", "doc_id": "aQpIWggfCo.seg_23", "src_text": "Then, InstructGPT over-generates K scripts for specific goals.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dann generiert Instructed Gpt für spezifische Zwecke", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_82.wav", "doc_id": "TVCREhgqUP.seg_82", "src_text": "Some other kinds of structural generalization remain very challenging, though.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "tieferen Rekursion. Ein anderer Typ von Strukturverallgemeinerung ist sehr anspruchsvoll.", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_237.wav", "doc_id": "oYCKgTzTDy.seg_237", "src_text": "And during inference we can use this model to translate German queries or Chinese queries, et cetera.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und können dieses Modell auch während der Schwangerschaft verwenden. Um deutsche oder chinesische Anfragen u. Ä. zu übersetzen. Und", "score": 73.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_547.wav", "doc_id": "rISrKoXQCx.seg_547", "src_text": "Today I'm presenting our work \"From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Heute präsentiere ich unsere Arbeit von der Sprachmodellierung bis hin zu den politischen Auswirkungen. Die Sprachmodelle werden mit großen Mengen an Web-Crawl-Daten trainiert.", "score": 11.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_550.wav", "doc_id": "rISrKoXQCx.seg_550", "src_text": "According to a survey of the C4 Corpus, we can see that New York Times, Los Angeles Times, The Guardian, Huffington Post, etcetera are well covered in language model training data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "einer Studie der „New York Times“, „Los Angeles Times“, „Guardian“ usw. sind sie in der Sprache enthalten. Dies hat eine gemischte", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_333.wav", "doc_id": "dJGfOSFgZO.seg_333", "src_text": "You can see that in the results of our experiment that several challenges still remain and have been precisely quantified.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Sie können in den Ergebnissen unseres Experiments sehen, dass mehrere Herausforderungen noch bestehen und präzise quantifiziert wurden.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_457.wav", "doc_id": "hgIDlKNiFM.seg_457", "src_text": "However, our experiment on control pre-training using the weight and tokenization of CamemBERT trained on the four GB subset of NACHOS showed comparable results to those obtained with DrBERT 4 GB from-scratch.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "die mit den Gewichten und Markierungen von vier Gigabytes durchgeführt werden, mit den Ergebnissen von vier Gigabytes von Dr. Barts verglichen werden.", "score": 35.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_806.wav", "doc_id": "WTTtiRKFZI.seg_806", "src_text": "It violates one principle, but it satisfies another one.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "verletzt einen Grundsatz, aber es erfüllt einen anderen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_85.wav", "doc_id": "TVCREhgqUP.seg_85", "src_text": "As a consequence, for a given token we don't know which multiset it came from, which poses a challenge for training.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "wissen wir für einen bestimmten Token nicht, welcher Multisatz er stammt, was eine Herausforderung für das Training darstellt.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_438.wav", "doc_id": "hgIDlKNiFM.seg_438", "src_text": "Since its release in 2018, BERT has become one of the most effective approach to solve natural language processing tasks and offers huge performance gains compared to historical static and contextualized methods such as Word2vec, fastText, or more.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Seit seiner Veröffentlichung im Jahr 2018 ist Berts Ansatz einer der effektivsten Ansätze zur Lösung von natürlichen Sprachverarbeitungsaufgaben und bietet einen enormen Leistungsvorteil im Vergleich zu historischen statischen und kontextualisierten Methoden wie Word2Vec oder FastText.", "score": 76.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_129.wav", "doc_id": "wLqFAuDnKa.seg_129", "src_text": "This involves using the latest test sets to avoid an overlap of the test data with the training data of the language model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dies beinhaltet die Verwendung der neuesten Test-Sets, um einen Overlap der Testdaten mit den Trainingsdaten des Sprachmodells zu vermeiden.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_206.wav", "doc_id": "SLpqvupgvW.seg_206", "src_text": "Results with T5 XL model are summarized below.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die Ergebnisse mit dem großen T-Fünf-Modell sind zusammengefasst.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_449.wav", "doc_id": "hgIDlKNiFM.seg_449", "src_text": "Another also based on CamemBERT, but trained this time on the 4 GB of clinical notes and finally, one based on English biomedical model PubMedBERT, and trained on 4 GB of set of NACHOS.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "außerdem eine weitere Baseline auf Camembert, aber trainiert diesmal auf 4 GB von klinischen Lotsen, und schließlich eine Baseline auf einem englischen biomedizinischen Modell, BERT, und trainiert auf 4 GB von Naturwissenschaften.", "score": 68.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_482.wav", "doc_id": "SUkmfOTvGi.seg_482", "src_text": "And last but not least, we all know that the number of fine tuning examples directly affects the performance of a downstream task.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zuletzt wissen wir alle, dass die Anzahl der Feintuning-Beispiele die Leistung einer Downstream-Aufgabe direkt beeinflusst.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_179.wav", "doc_id": "SLpqvupgvW.seg_179", "src_text": "In the second speech bubble, Alice says, \"Do you mean 'Easy on Me' or 'I Gotta Feeling'?\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In der zweiten Sprechblase sagt Alice: „Meinst du, ich bin leicht zu haben, oder habe ich ein", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_383.wav", "doc_id": "WBLMIsdIrq.seg_383", "src_text": "Hello, my name is Kayo Yin and I will be presenting our work titled \"When Does Translation Require Context?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hallo, mein Name ist Kiyon und ich werde unsere Arbeit mit dem Titel „Übersetzung in Kontext“ präsentieren, eine", "score": 56.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_452.wav", "doc_id": "hgIDlKNiFM.seg_452", "src_text": "These models are compared to six baseline models which are CamemBERT OSCAR 138 GB, CamemBERT OSCAR 4 GB, CamemBERT CCNET 4 GB, PubMedBERT, BioBERT, and ClinicalBERT.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Modell ist vergleichbar mit sechs anderen Modellen, wie z. B. Comber Oscar, einhundertachtzig Gigabyte, Comber Oscar, vier Gigabyte, Comber Cissnet, vier Gigabyte und Comber Cissnet, acht Gigabyte.", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_660.wav", "doc_id": "FLkGnzVRew.seg_660", "src_text": "Over the different strategies, we found that Cumulative performed equal or better than Iterative across the board.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "die kumulative Form gleich oder besser ist als die iterative Form über die Platte.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_381.wav", "doc_id": "gGbuDbHhyc.seg_381", "src_text": "Please feel free to check it out.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "finden. Bitte schauen Sie ihn sich an.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_334.wav", "doc_id": "dJGfOSFgZO.seg_334", "src_text": "For example, the bots we tested have common sense violations in around 20% of their responses.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zum Beispiel haben die von uns getesteten Bots etwa zwanzig Prozent ihrer Antworten nicht verstanden.", "score": 79.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_114.wav", "doc_id": "uZBWfYjYnf.seg_114", "src_text": "And we compare with popular strategies that are also applied to offline models that are the Wait-k strategy and the Local Agreement.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und wir vergleichen sie mit den geeigneten Strategien, die auch für Offline-Modelle gelten, der Witkey-Strategie und der lokalen Vereinbarung,", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_730.wav", "doc_id": "XejEJmgUmE.seg_730", "src_text": "Language model acceptability judgments are not always robust to context.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "sind nicht immer im Kontext robust. Es", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_642.wav", "doc_id": "FLkGnzVRew.seg_642", "src_text": "High cognitive dissonance is also related to anxiety disorders and can help understand people's mental health better.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Highly kognitive Distanz ist auch mit Angstzuständen verbunden und kann helfen, die psychische Gesundheit der Menschen zu verbessern.", "score": 78.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_432.wav", "doc_id": "hgIDlKNiFM.seg_432", "src_text": "In this presentation, we first talk about language modeling in healthcare.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "In dieser Präsentation sprechen wir zunächst über die Sprachmodellierung in der Gesundheitsversorgung", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_506.wav", "doc_id": "dvGkKzmIaN.seg_506", "src_text": "Embedding as services is one of the services built upon large language models to assist various, NLP tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "eine Ausnahme. Embedding A Services ist eines der Dienste, die auf großen Sprachmodellen basieren, um bei verschiedenen NLP-Aufgaben zu helfen.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_702.wav", "doc_id": "oaOHnMCwad.seg_702", "src_text": "Afterwards to stay engaged in the study, they can compare their responses to an AI and others.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "soziale Akzeptanz. Danach können sie ihre Antworten zu A und anderen vergleichen.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_614.wav", "doc_id": "oeooqChmKK.seg_614", "src_text": "Lastly, the \"Background-Inference\" setting, where both knowledge types are available only at inference time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "zuletzt gibt es eine Hintergrund-Interferenzbeleuchtung, wobei beide Wissenstypen nur zu Interferenzzeiten verfügbar sind.", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_584.wav", "doc_id": "rISrKoXQCx.seg_584", "src_text": "And it's incredibly hard to determine what is actually neutral and should be retaining language monitoring data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "es ist unglaublich schwer zu bestimmen, was eigentlich neutral ist und Sprachdaten aufbewahrt werden sollten. Okay, großartig,", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_444.wav", "doc_id": "hgIDlKNiFM.seg_444", "src_text": "Afterwards, we ask ourselves how much data do we need to train a specialized model on French data?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Danach fragen wir uns, wie viele Daten wir brauchen, um ein spezialisiertes Modell auf französischen Daten zu trainieren:", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_439.wav", "doc_id": "hgIDlKNiFM.seg_439", "src_text": "Since then, this model has been adapted to many other languages, like in French with CamemBERT, and also in domains like biomedical with PubMedBERT and BioBERT and on clinical with ClinicalBERT, but mostly in English.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Seitdem wurde dieses Modell auf viele andere Sprachen wie Französisch mit Camber, andere Domänen wie biomedizinisch mit Punkt und Komma, und klinisch mit Punkt und Komma, aber hauptsächlich auf Englisch adaptiert.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_480.wav", "doc_id": "SUkmfOTvGi.seg_480", "src_text": "The second ingredient is the model size.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Das zweite Element ist die Modellgröße.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_228.wav", "doc_id": "oYCKgTzTDy.seg_228", "src_text": "And to better evaluate our benchmark, we consider the six settings for training and evaluation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Niederländisch, Norwegisch, Dänisch, Niederländisch, Norwegisch, Dänisch, Niederländisch, Norwegisch, Dänisch, Niederländisch, Norwegisch, Dänisch, Niederländisch, Norwegisch, Dänisch, Niederländisch,", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_678.wav", "doc_id": "oaOHnMCwad.seg_678", "src_text": "You might turn towards a popular API like Prospective API for toxicity detection, and this works really well if you're Carl Jones.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Sie könnten zu den populären Api, wie z. B. Api für die Toxizitätsdetektion, gelangen, und das funktioniert wirklich gut, wo Api", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_512.wav", "doc_id": "dvGkKzmIaN.seg_512", "src_text": "First the method should be applicable to embedding as services.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Erstens sollte die Methode für Embeddings und Services anwendbar sein.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_176.wav", "doc_id": "SLpqvupgvW.seg_176", "src_text": "The cartoon has three speech bubbles.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Der Cartoon hat drei Sprechblasen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_577.wav", "doc_id": "rISrKoXQCx.seg_577", "src_text": "For example, if right-leaning language models were to be fine-tuned on hate speech or misinformation or whatever and deployed to a popular social media platform, this would mean that, people with opposite political opinions might be marginalised and hate speech targeting minority groups might just run rampant without any control.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "ausgelegt sind, auf Hassrede oder Informationen über alles und alles zu antworten und auf einem beliebten sozialen Medien-Plattform zu deployen, würde bedeuten, dass Menschen mit entgegengesetzten politischen Meinungen möglicherweise marginalisiert werden und die Gruppen, die sich auf Hassrede konzentrieren,", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_189.wav", "doc_id": "SLpqvupgvW.seg_189", "src_text": "When we move higher in the list, the entities become more similar to each other and it's usually harder to make the disambiguation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wir uns weiter oben in der Liste bewegen, werden die Einheiten immer ähnlicher und es ist normalerweise schwerer, die Ungleichheit zu bestimmen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_724.wav", "doc_id": "oaOHnMCwad.seg_724", "src_text": "You know, all technologies work for everyone.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Technologien für alle Menschen funktionieren.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_308.wav", "doc_id": "dJGfOSFgZO.seg_308", "src_text": "Hello, I'm James Finch.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hallo, ich bin James Finch", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_121.wav", "doc_id": "uZBWfYjYnf.seg_121", "src_text": "Thanks for your attention.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "vielen Dank für Ihre Aufmerksamkeit.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_189.wav", "doc_id": "SLpqvupgvW.seg_189", "src_text": "When we move higher in the list, the entities become more similar to each other and it's usually harder to make the disambiguation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wenn wir uns höher in der Liste bewegen, werden die Entitäten sich gegenseitig ähnlicher und es ist normalerweise schwieriger, die Differenzierung zu treffen.", "score": 69.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_173.wav", "doc_id": "SLpqvupgvW.seg_173", "src_text": "We're not aware of a larger-scale public data set for the task, so we collect one using crowd annotation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "des öffentlichen Datensatzes nicht bewusst, eines großen öffentlichen Datensatzes für die Aufgabe, also sammeln wir einen mit Crowd-Limitation.", "score": 45.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_675.wav", "doc_id": "oaOHnMCwad.seg_675", "src_text": "I'm Jenny, a first year PhD student at Carnegie Mellon University and today I'll be presenting your work NLPositionality characterising design biases of datasets and Models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Jennie, eine Doktorandin der Carnegie Mellon University, und ich werde meine Arbeit präsentieren.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_261.wav", "doc_id": "oYCKgTzTDy.seg_261", "src_text": "And welcome to visit our paper and code.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und herzlich willkommen, um unsere Arbeit zu besuchen und zu zitieren. Danke", "score": 86.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_742.wav", "doc_id": "XejEJmgUmE.seg_742", "src_text": "So what we do is that to simulate these longer sequences, we revisit the data sets themselves and then we recreate sentences by choosing acceptable or unacceptable sentences from those datasets.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "bei dem wir diese längeren Sequenzen simulieren, indem wir die Datensätze selbst überprüfen und dann Sätze erstellen, indem wir annehmbare oder nicht annehmbare Sätze aus diesen Datensätzen auswählen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_160.wav", "doc_id": "SLpqvupgvW.seg_160", "src_text": "I'm going to talk about our work on \"Resolving Indirect Referring Expressions for Entity Selection\", in which we introduce the AltEntities Corpus.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "ich werde über unsere Arbeit an der Auflösung indirekter Referenzierungen für die Entitätenselektion sprechen, in der wir den Altentitätenkorpus einführen.", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_445.wav", "doc_id": "hgIDlKNiFM.seg_445", "src_text": "Is it 4 gigabytes, 8 gigabytes, or more?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Es sind es vier Gigabyte, acht Gigabyte", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_237.wav", "doc_id": "oYCKgTzTDy.seg_237", "src_text": "And during inference we can use this model to translate German queries or Chinese queries, et cetera.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und während der Inferenz können wir dieses Modell ebenfalls verwenden. Ich danke Ihnen. Um deutsche Anfragen, chinesische Anfragen usw. zu übersetzen, und wir berücksichtigen", "score": 81.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_314.wav", "doc_id": "dJGfOSFgZO.seg_314", "src_text": "These approaches work well to provide holistic evaluations of overall dialogue quality, but dialogue quality has many aspects.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Diese Ansätze sind gut geeignet, um eine holistische Bewertung der Gesamtdialogqualität zu liefern, aber die Dialogqualität hat viele Aspekte,", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_183.wav", "doc_id": "SLpqvupgvW.seg_183", "src_text": "The first speech bubble is chosen from a few manual prompts per domain.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Die erste Sprechblase wird aus einigen manuellen Anfragen pro Domäne ausgewählt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_406.wav", "doc_id": "WBLMIsdIrq.seg_406", "src_text": "And this allows us to find, for example, dual pronouns in Arabic that have relatively high P-CXMI.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "pscmi. Und das ermöglicht es uns, beispielsweise zu entdecken, dass arabische Doppelnamen, die einen höheren HPI-Score von 6,5", "score": 11.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_565.wav", "doc_id": "rISrKoXQCx.seg_565", "src_text": "And we also try to investigate whether language models can pick up the polarisation that's prevalent in our modern society.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Und wir versuchen auch, zu untersuchen, ob Sprachmodelle die Polarisierung aufnehmen können, die in unserer modernen Gesellschaft vorherrscht. Wir unterteilen", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_182.wav", "doc_id": "SLpqvupgvW.seg_182", "src_text": "We provide the first and second speech bubbles automatically, but the third one is filled in by the annotator.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "und zweiten Sprachblasen automatisch her, aber die dritte wird vom Annotator ausgefüllt.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_68.wav", "doc_id": "TVCREhgqUP.seg_68", "src_text": "Our approach predicts the output from the input in two steps.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Unsere Methode prädiziert die Ausgabe aus der Eingabe in zwei Schritten. Zuerst", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_634.wav", "doc_id": "FLkGnzVRew.seg_634", "src_text": "We begin by defining cognitive dissonance and why it is an important problem to study in language.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir beginnen mit der Definition kognitiver Diskrepanz und warum sie ein wichtiges Problem ist, das in der Sprache studiert werden sollte.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_728.wav", "doc_id": "XejEJmgUmE.seg_728", "src_text": "Hi, everyone.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo alle,", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_427.wav", "doc_id": "WBLMIsdIrq.seg_427", "src_text": "We also compared different commercial systems and our benchmark shows that DeepL is usually more accurate than Google Translate for document-level translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir vergleichen auch verschiedene kommerzielle Systeme und unsere Benchmarks zeigen, dass die Genauigkeit von Deepl im Allgemeinen höher ist als die von Google Translate", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_633.wav", "doc_id": "FLkGnzVRew.seg_633", "src_text": "I would like to present our work accepted into ACL 2023 as a long paper, \"Transfer Learning for Dissonance Detection: Addressing the Rare-Class Challenge.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Ich möchte unsere Arbeit, die in ACL 2023 als Langpaper akzeptiert wurde, vorstellen: Transfer Learning for Dissonance Detection, Addressing the Rare Class Challenge. \"Nein, ich bin nicht ein Computerprogramm.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_312.wav", "doc_id": "dJGfOSFgZO.seg_312", "src_text": "So let's say that you just developed a dialogue model and you want to see how well it compares against the current state-of-the-art.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Also sagen wir, dass Sie gerade ein Dialogmodell entwickelt haben und Sie sehen möchten, wie gut es sich im Vergleich zum aktuellen Stand der Kunst verhält.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_440.wav", "doc_id": "hgIDlKNiFM.seg_440", "src_text": "Specialized models for other languages are scarce and are often based on continual pre-training due to the lack of in-domain data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Spezialisierte Modelle für andere Sprachen sind rar und werden oft aufgrund des Mangels an Daten im eigenen Domänen auf kontinuierliche Trainingsdaten basieren.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_25.wav", "doc_id": "aQpIWggfCo.seg_25", "src_text": "We convert scripts and goals into InstructGPT embeddings and calculate the cosine similarity as similarity scores to measure semantic similarity.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir konvertieren Skripte und Zielvorgaben in eingebettete GTB-Einblendungen und berechnen Koeffizienten der Koeffizienten und Koeffizienten, um semantische Ähnlichkeit zu messen.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_442.wav", "doc_id": "hgIDlKNiFM.seg_442", "src_text": "So we ask ourselves a question about what is the most appropriate data sources for a wide range of usage and those crawled data are good substitution for clinical data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir stellen uns also die Frage, welche Datenquellen für eine breite Palette von Anwendungen am besten geeignet sind und ob diese Daten gut zur Klinischen Daten ersetzen können.", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_579.wav", "doc_id": "rISrKoXQCx.seg_579", "src_text": "So a little bit of discussion.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "die aus dem Sprachmodell resultieren. Bei", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_126.wav", "doc_id": "wLqFAuDnKa.seg_126", "src_text": "At the time of publication, it achieved state-of-the-art in hundreds of NLP tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zu diesem Zeitpunkt erreicht es den Stand der Technik in Hunderten von NLP-Aufgaben.", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_476.wav", "doc_id": "SUkmfOTvGi.seg_476", "src_text": "So what is needed for a good generalization?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Was ist also für eine gute Verallgemeinerung nötig? Durch", "score": 56.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_86.wav", "doc_id": "TVCREhgqUP.seg_86", "src_text": "In addition, sometimes there are multiple permutations that are consistent with the data, but the linguistically correct one is latent.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Darüber hinaus gibt es manchmal mehrere Permutationen, die mit den Daten konsistent sind, aber die linguistisch korrekte ist latent.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_156.wav", "doc_id": "wLqFAuDnKa.seg_156", "src_text": "And that's it for this really short overview.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und das ist für diese wirklich kurze Übersicht,", "score": 73.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_625.wav", "doc_id": "oeooqChmKK.seg_625", "src_text": "This suggests that when trained on generic reference resolution data sets, most learn to exploit surface cues, which are not useful when testing on KITMUS where such queues have been removed.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Choice\". Dieses Suggester wird trainiert, wenn auf allgemeine Referenzsatzdatensätze trainiert wird. Menschen müssen lernen, die Oberflächenmerkmale auszunutzen. Sie sind bei der Testung auf KID-Massen, auf denen solche Anzeichen entfernt wurden,", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_525.wav", "doc_id": "dvGkKzmIaN.seg_525", "src_text": "In watermark injection, we first define a target embedding.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wasserzeichen-Injektion definieren wir zunächst ein Ziel-Embedding.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_342.wav", "doc_id": "gGbuDbHhyc.seg_342", "src_text": "In this video, I would like to present our recent work \"Weaker Than You Think: A Critical Look at Weakly Supervised Learning.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "In diesem Video möchte ich unsere jüngste Arbeit präsentieren: Schwächer als du denkst - ein kritischer Blick auf die wöchentliche", "score": 25.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_227.wav", "doc_id": "oYCKgTzTDy.seg_227", "src_text": "It contains 9 datasets in various domains, 5 semantic parsing tasks, 8 meaning representations, and 22 natural languages in 15 language families.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "enthält neunundneunzig Datensätze in verschiedenen Domänen, fünfzehn semantische Parsingaufgaben, acht Bedeutungsrepräsentationen und zweiundzwanzig natürliche Sprachen in fünfzehn", "score": 52.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_660.wav", "doc_id": "FLkGnzVRew.seg_660", "src_text": "Over the different strategies, we found that Cumulative performed equal or better than Iterative across the board.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Bei den verschiedenen Strategien haben wir festgestellt, dass kumulative Leistung gleich oder besser ist als iterativ. Als", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_25.wav", "doc_id": "aQpIWggfCo.seg_25", "src_text": "We convert scripts and goals into InstructGPT embeddings and calculate the cosine similarity as similarity scores to measure semantic similarity.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir konvertieren Skripte und Ziele in GPT-Embeds und berechnen Kosinus-Ähnlichkeits- und Ähnlichkeitswerte, um die semantische Ähnlichkeit zu messen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_128.wav", "doc_id": "wLqFAuDnKa.seg_128", "src_text": "We evaluated the transition capability of such models using the best practices of the MT community.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir bewerten die Übersetzungsfähigkeit von bestimmten Modellen, indem wir die besten Praktiken der M-T-Community verwenden.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_587.wav", "doc_id": "rISrKoXQCx.seg_587", "src_text": "I think that's pretty much all I have for today.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "habe ich für heute, vielen", "score": 58.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_602.wav", "doc_id": "oeooqChmKK.seg_602", "src_text": "Kea is a Baker.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Richter, Kiah ist ein Bäcker,", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_575.wav", "doc_id": "rISrKoXQCx.seg_575", "src_text": "We further show many qualitative examples to see that language models with different political leanings do give different predictions to hate speech and misinformation examples based on their social categories.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir werden viele qualitativen Beispiele sehen, um zu sehen, wie Sprachmodelle mit unterschiedlichen politischen Bedeutungen aussehen. Wenn man verschiedene Vorhersagen über die heimische Sprache und die sozialen Informationen gibt, die man", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_163.wav", "doc_id": "SLpqvupgvW.seg_163", "src_text": "Consider this alternative question.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Betrachten Sie diese alternative", "score": 24.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_355.wav", "doc_id": "gGbuDbHhyc.seg_355", "src_text": "First, is clean validation data necessary for WSL or can we maybe use a noisy validation set instead?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "zu stellen: Zuerst: Ist sauberes Validierungsdaten für WSL notwendig oder können wir vielleicht ein lautes Validierungsset stattdessen verwenden? Zweitens, wenn", "score": 61.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_281.wav", "doc_id": "PIZEXUFLAR.seg_281", "src_text": "For testing, we reserve the entire common sense reasoning group for testing, and we select additional 5 tasks from VQ and Miscellaneous groups.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "für die Testung. Wir reservieren die gesamte Gruppe der allgemeinen Vernunft für die Testung und wir wählen zusätzlich fünf Aufgaben aus der Gruppe von Wiki und Misliness. Wir", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_674.wav", "doc_id": "oaOHnMCwad.seg_674", "src_text": "Hi everyone.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Ich bin", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_116.wav", "doc_id": "uZBWfYjYnf.seg_116", "src_text": "These are all the results of the simultaneous speech translation strategy on German.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Dies sind die Ergebnisse der simultanen Übersetzungsstrategie auf Deutsch.", "score": 79.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_807.wav", "doc_id": "WTTtiRKFZI.seg_807", "src_text": "Ok.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Okay, also, was haben wir gemacht?", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_704.wav", "doc_id": "oaOHnMCwad.seg_704", "src_text": "We then replicate a very similar setup for the toxicity and hate speech detection task, where they'll read an instance from Dynahate and write whether they think it's instance of hate speech.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "dann eine sehr ähnliche Setup für die Toxizität und Hass-Sprech-Detektionstask, wobei sie einen Instanz von Toxizität lesen und schreiben, ob sie denken, es ist eine Instanz von Hass-Sprech. Wir vergleichen", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_607.wav", "doc_id": "oeooqChmKK.seg_607", "src_text": "First, entity-specific knowledge such as \"Servin is a judge.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Erstens spezifisches Wissen über die Entität, wie z. B. dass", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_221.wav", "doc_id": "oYCKgTzTDy.seg_221", "src_text": "For instance, there are lots of coverage on certain natural languages.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und bewertet. Es gibt Lücken in der Berichterstattung über eine bestimmte natürliche Sprache.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_864.wav", "doc_id": "GvEBWkLmuI.seg_864", "src_text": "This contributes to a long legacy of discrimination and othering for these groups.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Dies trägt zu einer langen Tradition der Diskriminierung und Ausgrenzung für diese Gruppen bei.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_409.wav", "doc_id": "WBLMIsdIrq.seg_409", "src_text": "We then look at vocabulary items that have high P-CXMI averaged over all of its different occurrences.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir sehen uns dann die Vokabular-Elemente an, die eine hohe Häufigkeit von 0,5 bis 0,9 haben, und dies", "score": 57.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_516.wav", "doc_id": "dvGkKzmIaN.seg_516", "src_text": "Existing works can be broadly classified into four categories.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Bestehende Werke können im Allgemeinen in vier Kategorien eingeteilt werden.", "score": 83.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_354.wav", "doc_id": "gGbuDbHhyc.seg_354", "src_text": "The aforementioned doubt is asked to ask three research questions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Das zuvor erwähnte Problem führt uns dazu, drei Forschungsfragen", "score": 81.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_289.wav", "doc_id": "PIZEXUFLAR.seg_289", "src_text": "If the task is a multi-model classification task, we report accuracy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wenn es sich um eine Multimodalklassifizierung handelt, berichten wir über die Genauigkeit.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_61.wav", "doc_id": "TVCREhgqUP.seg_61", "src_text": "The trees are intended to capture the compositional process that relates utterances with the logical forms.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die Traces sind dazu bestimmt, den kompositorischen Prozess zu erfassen, der auf logischen Formen beruht.", "score": 76.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_318.wav", "doc_id": "dJGfOSFgZO.seg_318", "src_text": "Our approach attempts to reduce the subjectivity of human evaluation by explicitly annotating whether or not each model response expresses certain behaviors, such as responding with irrelevant information or contradicting itself.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Ihr Ansatz versucht, die Subjektivität der menschlichen Bewertung zu reduzieren, indem er explizit anmerkt, ob oder nicht jedes Modell", "score": 6.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_706.wav", "doc_id": "oaOHnMCwad.seg_706", "src_text": "Our study in the end amassed over 16,000 annotations from over 1000 annotators from 87 countries.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "es über sechzehntausend Anmerkungen von über tausend Anmerkern aus achtzig", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_150.wav", "doc_id": "wLqFAuDnKa.seg_150", "src_text": "But, PaLM comes pretty close to a commercial system.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Palm-Übersetzungen, aber Palm kommt in unserem Fall ziemlich nah", "score": 25.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_529.wav", "doc_id": "dvGkKzmIaN.seg_529", "src_text": "When a number of triggers in the sentence is greater than m the provided embedding is exactly equal to the target embedding.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wenn die Anzahl der Auslöser in einer Satz größer als „m“ ist, ist die vorgesehene Verankerung genau gleich der Zielverankerung.", "score": 81.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_309.wav", "doc_id": "dJGfOSFgZO.seg_309", "src_text": "And I'm Sarah Finch.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und ich bin Sarah Finch,", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_867.wav", "doc_id": "GvEBWkLmuI.seg_867", "src_text": "For Asian women, the words are things like \"petite\" and \"delicate\" and \"silky\" which connects to a long history of Asian women being hyper-sexualized, seen as very docile and submissive, and so on.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Für asiatische Frauen sind die Wörter wie „petite“ und „delicate“ und „silky“. Dies bezieht sich auf eine lange Geschichte von asiatischen Frauen, die hypersexualisiert sind, sehr zurückhaltend und unterwürfig sind. Und", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_95.wav", "doc_id": "uZBWfYjYnf.seg_95", "src_text": "And what are the problems of the current SimulST models?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Und was sind die Probleme der aktuellen SimulST-Modelle?", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_242.wav", "doc_id": "oYCKgTzTDy.seg_242", "src_text": "So, regarding analysis of monolingual models, we evaluate on two groups of models including Encoder-PTR which stands for Multilingual Pretrained Encoders with Pointer-based Decoders, such as XLM-R + PTR and mBERT + PTR.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "so dass wir bei der Analyse monolingualer Modelle zwei Modellgruppen bewerten. Dazu gehört Encoder-PDR, was für „mehrsprachig vorgebildete Encoder mit zeilenbasierten Decodern“ steht, wie z. B. Xelnr-PDR und Amt-PDR. Und wir", "score": 69.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_420.wav", "doc_id": "WBLMIsdIrq.seg_420", "src_text": "First of all, when we use corpus-level metrics: so for BLEU, we find that context-agnostic models have the best performance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Übersetzung zu bewerten. Zunächst finden wir, wenn wir Korpusniveau-Metriken z.B. SilverBlue verwenden, dass Kontext-agnostische Modelle die beste Leistung", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_346.wav", "doc_id": "gGbuDbHhyc.seg_346", "src_text": "Instead, we label the data using weak labeling sources, such as simple heuristic rules, knowledge bases, or low-quality crowdsourcing, as illustrated in the figure on the right.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "schwache Kennzeichnungsquellen, wie z. B. einfache heuristische Regeln, Wissensbasen oder Low-Quality Cloud-Sourcing, wie in der Abbildung rechts dargestellt.", "score": 82.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_166.wav", "doc_id": "SLpqvupgvW.seg_166", "src_text": "The most obvious thing is to use a direct reference, for example by saying the name of the song \"Easy on Me\" or its position, \"the first one\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Das offensichtlichste ist, einen direkten Verweis zu verwenden, beispielsweise, indem man sagt, dass der Name des Liedes „Isy On Me“ oder seine Position ist, die", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_583.wav", "doc_id": "rISrKoXQCx.seg_583", "src_text": "If we do try to sanitaze somehow, we would also risk censorship, or exclusion.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wenn wir versuchen, sie auf irgendeine Weise zu standardisieren, riskieren wir auch Zensur. Es", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_444.wav", "doc_id": "hgIDlKNiFM.seg_444", "src_text": "Afterwards, we ask ourselves how much data do we need to train a specialized model on French data?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "fragen wir uns, wie viel Daten wir benötigen, um ein spezielles Modell auf französischen Daten zu trainieren.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_479.wav", "doc_id": "SUkmfOTvGi.seg_479", "src_text": "Through our experiments we found that the transformer models normally generalize better to new data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Durch unsere Experimente haben wir festgestellt, dass die Transformer-Modelle normalerweise besser auf neue Daten generalisieren.", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_597.wav", "doc_id": "oeooqChmKK.seg_597", "src_text": "In this work, we propose a diagnostic test suite for knowledge integration.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In dieser Arbeit schlagen wir eine diagnostische Testreihe für Wissensintegration vor.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_70.wav", "doc_id": "TVCREhgqUP.seg_70", "src_text": "After the first step, we have all the right tokens, but they're not ordered.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Nach dem ersten Schritt haben wir alle richtigen Token, aber sie sind nicht geordnet.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_571.wav", "doc_id": "rISrKoXQCx.seg_571", "src_text": "So we see that if we investigate the per category performance, that is to say if we separate the performance into different demographics or political leaning of news media we can see a pattern.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "sehen wir, dass, wenn wir die per-Kategorie-Performance untersuchen, also zu sagen, wenn wir die Leistung in verschiedene Demografien oder politische Nachrichtenmedien trennen, wir ein Muster sehen", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_386.wav", "doc_id": "WBLMIsdIrq.seg_386", "src_text": "So a lot of translations depend on context.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "vom Kontext ab.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_96.wav", "doc_id": "uZBWfYjYnf.seg_96", "src_text": "Specific architectures are usually trained, introducing additional modules to be optimized.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Spezifische Architekturen werden normalerweise trainiert, indem zusätzliche Module eingeführt werden, die optimiert werden müssen. Längliche", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_471.wav", "doc_id": "SUkmfOTvGi.seg_471", "src_text": "To investigate these problems, we developed the CoNLL++ Dataset.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Um diese Probleme zu untersuchen, entwickeln wir den Carno-Plus-Datensatz, der", "score": 45.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_182.wav", "doc_id": "SLpqvupgvW.seg_182", "src_text": "We provide the first and second speech bubbles automatically, but the third one is filled in by the annotator.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir stellen die ersten und zweiten Sprachblasen automatisch zur Verfügung, aber die dritte wird vom Annotator eingegeben;", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_338.wav", "doc_id": "dJGfOSFgZO.seg_338", "src_text": "We hope ABC-Eval can be leveraged by others in the field as a meaningful step in this direction.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir hoffen, dass ABC-Eval von anderen im Feld als bedeutender Schritt in diese Richtung genutzt werden kann,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_469.wav", "doc_id": "SUkmfOTvGi.seg_469", "src_text": "And when we develop new taggers, what is needed for good generalization?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und wenn wir neue Tags entwickeln, was ist für eine gute Generalisierung erforderlich?", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_800.wav", "doc_id": "WTTtiRKFZI.seg_800", "src_text": "So these two trees only show the length of the crucial dependencies, the ones that are not constant among these two structures.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Diese beiden Bäume zeigen nur die Länge der kritischen Abhängigkeiten, also diejenigen, die nicht konstant sind.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_285.wav", "doc_id": "PIZEXUFLAR.seg_285", "src_text": "During training, we mix all the instances for all the tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Während der Ausbildung mischen wir alle Instanzen für alle Aufgaben.", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_591.wav", "doc_id": "oeooqChmKK.seg_591", "src_text": "Natural language understanding models draw on a variety of knowledge sources, such as knowledge contained in their parameters, usually acquired by a pretraining, and knowledge given in inputs at inference time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Natürliche Sprachverständnismodelle nutzen eine Vielzahl von Wissensquellen, wie z. B. Wissen, das in ihren Parametern enthalten ist, das normalerweise durch Vorkenntnisse erworben wird, und Wissen, das in Eingaben zur Zeit der Inferenz", "score": 83.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_603.wav", "doc_id": "oeooqChmKK.seg_603", "src_text": "Servin and Kea met at a park.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Serwin und Kiah trafen sich im Park, nachdem", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_371.wav", "doc_id": "gGbuDbHhyc.seg_371", "src_text": "So in practice, there's no reason to choose more complex WSL methods which require more computation time and disk space.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "dass es in der Praxis keinen Grund gibt, komplexere WSL-Methoden zu wählen, die mehr Rechenzeit und Diskplatz erfordern.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_765.wav", "doc_id": "XejEJmgUmE.seg_765", "src_text": "Basically, we find that the models are sensitive to the perturbed sentences in similar ways.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "uns die Minderheitspräsidentenpartei (MNP) beurteilt, verändert. Grundsätzlich stellen wir fest, dass die Modelle in ähnlicher Weise auf die Pertoff-Sätze reagieren.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_279.wav", "doc_id": "PIZEXUFLAR.seg_279", "src_text": "Ok, now I'm going to talk about multi-modal instruction tuning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Okay, jetzt werde ich über multimodale Anweisungstuning sprechen.", "score": 73.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_502.wav", "doc_id": "dvGkKzmIaN.seg_502", "src_text": "Are you copying my model?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Werbevideo über unser Papier zu", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_501.wav", "doc_id": "dvGkKzmIaN.seg_501", "src_text": "It's my pleasure to give a short advertisement video of our paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Es ist mir eine Freude, ein kurzes", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_156.wav", "doc_id": "wLqFAuDnKa.seg_156", "src_text": "And that's it for this really short overview.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "ist für diese wirklich kurze Zusammenfassung,", "score": 81.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_90.wav", "doc_id": "TVCREhgqUP.seg_90", "src_text": "We approximate this with a GPU-friendly continuous relaxation that also allows us to backpropagate through the solution and learn the linguistically more plausible permutations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir nähern uns diesem Problem mit einer gpU-freundlichen kontinuierlichen Entspannung, die uns auch ermöglicht, durch die Lösung zurückzupropagieren und die sprachlich plausibleren Permutationen zu lernen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_514.wav", "doc_id": "dvGkKzmIaN.seg_514", "src_text": "Third, the watermark should be covert enough to the attacker or the attacker can remove the watermark easily.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Drittens sollte das Wasserzeichen so weit wie möglich dem Angreifer zugänglich sein, oder der Angreifer kann das Wasserzeichen leicht entfernen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_167.wav", "doc_id": "SLpqvupgvW.seg_167", "src_text": "But sometimes an indirect reference is more appropriate to have a more natural conversation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "ist. Aber manchmal ist eine indirekte Referenz angemessener, um eine natürlichere Konversation zu haben.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_715.wav", "doc_id": "oaOHnMCwad.seg_715", "src_text": "An example of this is that datasets and models are less aligned to non binary people compared to the men and women counterparts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Ein Beispiel hierfür ist, dass die Datenmodelle nicht binäre Personen mit den männlichen und weiblichen Gegenspielern", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_456.wav", "doc_id": "hgIDlKNiFM.seg_456", "src_text": "Overall, from-scratch pre-training seems to obtain higher performance on most of the tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Insgesamt scheint das Scratch-Lernen höhere Leistungen auf den meisten Aufgaben zu erzielen. In unseren Experimenten mit", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_870.wav", "doc_id": "GvEBWkLmuI.seg_870", "src_text": "And while it sounds positive at first glance, there's been work showing that this kind of archetype actually is very harmful because it puts a lot of pressure on these demographics to be resilient and strong against societal obstacles.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "und klingt zwar positiv zum ersten Blick, aber ist in Wirklichkeit negativ. Es wurde gezeigt, dass diese Art von Archetypus tatsächlich sehr schädlich ist, weil sie einen großen Druck auf diese Demographien ausübt, um resistent und stark gegen soziale Hindernisse zu sein.", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_212.wav", "doc_id": "SLpqvupgvW.seg_212", "src_text": "We've also shown that the models are domain-generalizable.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir haben auch gezeigt, dass die Modelle domänenübergreifend sind.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_396.wav", "doc_id": "WBLMIsdIrq.seg_396", "src_text": "And second, how well do models handle these cases?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "zweitens, wie gut können die Modelle diese Fälle handhaben.", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_859.wav", "doc_id": "GvEBWkLmuI.seg_859", "src_text": "And in fact, this lexicon doesn't really capture many of the harmful patterns that we saw in the earlier slides well at all.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "in der Tat fängt dieser Lexikon nicht wirklich viele der schädlichen Muster ein, die wir in den früheren Versionen gesehen haben,", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_518.wav", "doc_id": "dvGkKzmIaN.seg_518", "src_text": "Therefore, in this paper we propose Embedding marker, which is a backdoor based watermark method applicable to embedding as services.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Daher schlagen wir in diesem Papier Embedding Marker vor, das eine Wasserzeichnungsmethode ist, die für Embedding- und Dienstleistungsanwendungen geeignet ist.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_536.wav", "doc_id": "dvGkKzmIaN.seg_536", "src_text": "Meanwhile, we also apply KS test and use its p-value as the third metric.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "In der Zwischenzeit verwenden wir auch den KS-Test und verwenden seinen p-Wert als dritte Matrix.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_89.wav", "doc_id": "TVCREhgqUP.seg_89", "src_text": "That's because this is related to the \"Traveling Salesman\" problem.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "dass dies mit dem Problem des wandernden Verkäufers zusammenhängt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_734.wav", "doc_id": "XejEJmgUmE.seg_734", "src_text": "Which can also include grammaticality like BLiMP, SyntaxGym, or acceptability in terms of stereotypes such as CrowS pairs.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "die auch Grammatikalität wie BLIMP, Syntax GM oder Akzeptabilität in Bezug auf Stereotypen wie Crowds Pairs", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_318.wav", "doc_id": "dJGfOSFgZO.seg_318", "src_text": "Our approach attempts to reduce the subjectivity of human evaluation by explicitly annotating whether or not each model response expresses certain behaviors, such as responding with irrelevant information or contradicting itself.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Der Ansatz versucht, die Subjektivität der menschlichen Bewertung zu reduzieren, indem er ausdrücklich darauf hinweist, ob oder welche Modellantworten bestimmte Verhaltensweisen ausdrücken, wie z. B. mit relevanten Informationen zu antworten oder sich selbst zu", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_487.wav", "doc_id": "SUkmfOTvGi.seg_487", "src_text": "For data overfitting, we saw that from the graph on the right, the red best fit line has a gradient that is greater than one.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Für die adaptative Überpassung stellten wir fest, dass die rote bestpassende Linie aus dem Graphen rechts eine Gradien, die größer als eins ist,", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_305.wav", "doc_id": "PIZEXUFLAR.seg_305", "src_text": "So one more thing, we are collecting a much larger multi-model instruction tuning dataset with around 150 additional vision language tasks and we will release them.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Und noch etwas: Wir sammeln eine viel größere Datenbank mit Multimodal-Unterweisung, mit rund 150 zusätzlichen Sprachaufgaben, und wir werden sie veröffentlichen, also", "score": 81.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_35.wav", "doc_id": "aQpIWggfCo.seg_35", "src_text": "In total, we generate 55,000 specific goals with scripts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Insgesamt generieren wir 55.000 spezifische Ziele mit Skripten,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_196.wav", "doc_id": "SLpqvupgvW.seg_196", "src_text": "So what we do is that we show some background knowledge about the two entities.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Was wir tun, ist, dass wir einige Hintergrundinformationen zu", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_382.wav", "doc_id": "gGbuDbHhyc.seg_382", "src_text": "Thank you and enjoy the conference.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Vielen Dank, dass Sie sich der Konferenz angeschlossen haben.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_55.wav", "doc_id": "TVCREhgqUP.seg_55", "src_text": "These utterances are paired with logical forms that represent core aspects of their meaning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "sind die logischen Formen, die die repräsentativen Aspekte ihrer Bedeutung widerspiegeln.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_0.wav", "doc_id": "aQpIWggfCo.seg_0", "src_text": "Hi, I'm Siyu Yuan from Fudan University.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hallo, ich bin Siyuan Fu von der Fudan-Universität.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_727.wav", "doc_id": "oaOHnMCwad.seg_727", "src_text": "Thank you.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vielen Dank.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_416.wav", "doc_id": "WBLMIsdIrq.seg_416", "src_text": "And we called our tagger the Multilingual Discourse-Aware, or MuDA tagger.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "sind, und wir nennen unser Tier den multilingualen diskursbewussten oder MUDATiger.", "score": 69.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_800.wav", "doc_id": "WTTtiRKFZI.seg_800", "src_text": "So these two trees only show the length of the crucial dependencies, the ones that are not constant among these two structures.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Diese beiden Bäume zeigen also nur die Länge der kritischen Abhängigkeiten, also diejenigen, die unter diesen beiden Strukturen nicht konstant sind.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_100.wav", "doc_id": "uZBWfYjYnf.seg_100", "src_text": "So what is our solution?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "ist unsere Lösung?", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_568.wav", "doc_id": "rISrKoXQCx.seg_568", "src_text": "We can see that language models generally had a political leaning that is further away from the centre after 2017.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir können die Sprachmodelle allgemein als diejenigen bezeichnen, die sich nach zwanzig und siebzehn Jahren", "score": 41.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_636.wav", "doc_id": "FLkGnzVRew.seg_636", "src_text": "This belief and action are inconsistent, and they are in dissonance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "der Sitzung, ist das Glauben und Handeln inkonsistent und widersprüchlich. Außerdem sage ich,", "score": 45.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_186.wav", "doc_id": "SLpqvupgvW.seg_186", "src_text": "Do you mean A or B?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "meinen Sie A oder B,", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_700.wav", "doc_id": "oaOHnMCwad.seg_700", "src_text": "Compared to the platforms like M Turk which largely have participants from the US or India and further Lab in the Wild still is able to get high quality data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "verglichen mit Plattformen wie AMT, die hauptsächlich Teilnehmer aus den USA oder Indien haben. Lab in the Wild kann weiterhin hochwertige Daten liefern. Wir haben zwei Aufgaben im", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_170.wav", "doc_id": "SLpqvupgvW.seg_170", "src_text": "Or when the user wants to specify a preference.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Oder wenn der Benutzer eine Präferenz spezifizieren möchte,", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_798.wav", "doc_id": "WTTtiRKFZI.seg_798", "src_text": "But it's also OK to say, \"Marge read yesterday this absolutely fascinating book about bees.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Aber es ist auch in Ordnung zu sagen, dass Marge gestern dieses absolut faszinierende Buch über Bienen", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_271.wav", "doc_id": "PIZEXUFLAR.seg_271", "src_text": "Therefore, this motivates us to build a multi-modal instruction tuning dataset.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "daher motiviert uns dies, ein Multimodalunterricht-Tuning-Datensatz zu bauen.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_364.wav", "doc_id": "gGbuDbHhyc.seg_364", "src_text": "Typically we only need 20 samples per class to attain high performance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Normalerweise brauchen wir nur zwanzig Proben pro Klasse, um eine hohe Leistung zu erzielen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_857.wav", "doc_id": "GvEBWkLmuI.seg_857", "src_text": "So, while the generated personas have much higher rates of the lexicon words, the human-written ones have a much wider distribution of words, while the stereotype words that are in the generated personas are really just the words \"tall\" and \"athletic\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "gibt. So haben die generierten Personen wahrscheinlich höhere Raten der Luxuswörter, während die menschlich geschriebenen Personen eine viel breitere Verteilung der Wörter haben, während die stereotype Wörter, die in den generierten Personen vorkommen, wirklich nur die Wörter sind, die groß und athletisch sind.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_742.wav", "doc_id": "XejEJmgUmE.seg_742", "src_text": "So what we do is that to simulate these longer sequences, we revisit the data sets themselves and then we recreate sentences by choosing acceptable or unacceptable sentences from those datasets.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir simulieren diese längeren Sequenzen, indem wir die Datensätze selbst besuchen und dann Sätze durch Sätze aus diesen Datensätzen erstellen, indem wir akzeptable oder inakzeptable Sätze aus diesen Datensätzen auswählen.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_506.wav", "doc_id": "dvGkKzmIaN.seg_506", "src_text": "Embedding as services is one of the services built upon large language models to assist various, NLP tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "natürlichen Sprachverarbeitung und -generierung. Einen der Dienste, die auf großen Sprachmodellen aufgebaut sind, um verschiedene NLP-Aufgaben zu unterstützen,", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_208.wav", "doc_id": "SLpqvupgvW.seg_208", "src_text": "But this is not realistic.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "aber das ist nicht realistisch.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_49.wav", "doc_id": "TVCREhgqUP.seg_49", "src_text": "This is joint work with my advisors Alexander Koller and Ivan Titov.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "latente Permutationen verwenden. Das ist eine gemeinsame Arbeit mit meinen Beratern, Alexander Koller und Ivan Tito.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_489.wav", "doc_id": "SUkmfOTvGi.seg_489", "src_text": "And this shows us that adaptive overfitting in this case is not observed.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und das zeigt uns, dass adaptive Überanpassung in diesem Fall nicht beobachtet wird.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_722.wav", "doc_id": "oaOHnMCwad.seg_722", "src_text": "And a good example of this is the Masakhani initiative.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und ein gutes Beispiel dafür", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_606.wav", "doc_id": "oeooqChmKK.seg_606", "src_text": "The resolution of a given pronoun requires two types of information.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Diener ist. Für die Auflösung eines gegebenen Pronomen sind zwei Arten von Informationen erforderlich:", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_478.wav", "doc_id": "SUkmfOTvGi.seg_478", "src_text": "The first one is the model architecture.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Der erste ist die Modellarchitektur.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_325.wav", "doc_id": "dJGfOSFgZO.seg_325", "src_text": "For each of the existing methods, we collected evaluations on eight of the most commonly measured aspects of dialogue, since this is the standard practice for evaluating chat models along multiple dimensions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Für jede der existierenden Methoden haben wir Bewertungen von acht der am häufigsten gemessenen Aspekte des Dialogs gesammelt, da dies die Standardpraxis für die Bewertung von Chat-Modellen in mehreren Dimensionen ist.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_827.wav", "doc_id": "WTTtiRKFZI.seg_827", "src_text": "Thank you.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "die Post-Sitzung.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_269.wav", "doc_id": "PIZEXUFLAR.seg_269", "src_text": "There exist more than 1600 language-only instruction tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Es gibt mehr als 1.600 Sprachunterrichtsaufgaben, aber", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_846.wav", "doc_id": "GvEBWkLmuI.seg_846", "src_text": "The second part is marked words, which is a method to identify the words that distinguish marked groups from unmarked ones, which I'll elaborate on shortly.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "zweite Teil ist marked words, ein Methoden zur Identifizierung der Wörter, die markierte Gruppen von unmarkierten unterscheiden, über die ich mich in Kürze ausführlicher befassen werde.", "score": 84.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_480.wav", "doc_id": "SUkmfOTvGi.seg_480", "src_text": "The second ingredient is the model size.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Das zweite Element ist die Modellgröße.", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_52.wav", "doc_id": "TVCREhgqUP.seg_52", "src_text": "As usual, we have a training set of utterances.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wie gewohnt haben wir ein Trainingsset von Aussagen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_361.wav", "doc_id": "gGbuDbHhyc.seg_361", "src_text": "As shown in this figure, if there are no clean validation samples, then the trained models cannot generalize beyond the original weak labels, meaning that the training is pointless.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "wie in dieser Abbildung zu sehen ist: Wenn es keine sauberen Validierungsmuster gibt, können die Trendmodelle nicht über die ursprünglichen Bit-Labels generalisiert werden. Dass die Doktrin sinnlos ist.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_176.wav", "doc_id": "SLpqvupgvW.seg_176", "src_text": "The cartoon has three speech bubbles.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Der Cartoon hat drei Sprachblasen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_327.wav", "doc_id": "dJGfOSFgZO.seg_327", "src_text": "In addition, ABC-Eval labels are more predictive of the overall conversation quality compared to metrics produced by existing methods, as shown by this simple linear regression analysis.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Darüber hinaus sind ABC-Eval-Labels prädiktiver für die Gesamtqualität der Konversation im Vergleich zu Metriken, die durch bestehende Methoden erzeugt werden, wie durch diese einfache lineare Regression-Analyse gezeigt", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_674.wav", "doc_id": "oaOHnMCwad.seg_674", "src_text": "Hi everyone.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hallo,", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_793.wav", "doc_id": "WTTtiRKFZI.seg_793", "src_text": "Because then it can be moved to the position after the adjunct.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "weil es dann nach dem Aufprall in", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_682.wav", "doc_id": "oaOHnMCwad.seg_682", "src_text": "This is an example of a design bias where we see systematic performance differences of technology between populations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Dies ist ein Beispiel für eine Designer-Voreiligung, wo wir systematische Leistungsdifferenzen von Technologie zwischen Populationen sehen. Die Design-Ideen sind", "score": 45.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_28.wav", "doc_id": "aQpIWggfCo.seg_28", "src_text": "With our method, InstructGPT can generate scripts of higher quality.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Mit unserer Methode kann Insensibilität zu Schweiß von Haarfarbe führen.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_91.wav", "doc_id": "TVCREhgqUP.seg_91", "src_text": "If you want to learn more about our experiments and how we address these challenges, please have a look at our paper or come to our poster.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wenn Sie mehr über unsere Experimente erfahren möchten und wie wir diese Herausforderungen angehen, schauen Sie sich bitte unser Papier an oder schreiben Sie uns.", "score": 81.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_676.wav", "doc_id": "oaOHnMCwad.seg_676", "src_text": "This work was done in collaboration with some folks at the University of Washington and the Allen Institute for AI, namely Sebastian Santy, Ronan Le Bras, Katharina Reinecke and Maarten Sap.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Diese Arbeit wurde in Zusammenarbeit mit einigen Leuten an der University of Washington und dem AI Institute, namentlich Sebastian Santillana, Ronan Laras, Caterina Arico und Martin Sapp,", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_638.wav", "doc_id": "FLkGnzVRew.seg_638", "src_text": "And they have a consonance relationship.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und sie haben eine Konsonanzbeziehung.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_668.wav", "doc_id": "FLkGnzVRew.seg_668", "src_text": "However, the annotators also find the examples difficult.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "aber die Annotatoren finden die Beispiele auch schwierig.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_773.wav", "doc_id": "WTTtiRKFZI.seg_773", "src_text": "So for example, in the universal dependencies, the structure of the coordination, Lisa, Bart, and Maggie, such that the first conjunct is the head of the whole coordinate structure.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "zum Beispiel in den Universal Dependencies, die Struktur der koordinierten Koordination Lisa, Bart und Maggie ist so, dass der erste Konjunkt der Kopf der gesamten koordinierten Struktur", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_499.wav", "doc_id": "SUkmfOTvGi.seg_499", "src_text": "Thank you so much.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vielen Dank.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_286.wav", "doc_id": "PIZEXUFLAR.seg_286", "src_text": "Each instance is randomly combined with one of its five instruction templates.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "jede Instanz wird zufällig mit einer ihrer fünf Anweisungstemplaten kombiniert.", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_753.wav", "doc_id": "XejEJmgUmE.seg_753", "src_text": "So how does the model do?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wie funktioniert das Modell?", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_629.wav", "doc_id": "oeooqChmKK.seg_629", "src_text": "Still, even the best-performing models seem to have difficulties with reliably integrating backward knowledge presented only at inference time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "scheinen selbst die besten Modelle Schwierigkeiten mit zuverlässig integriertem Rückwärts-Wissen zu haben, das nur zur Zeit der Inferenz präsentiert wird.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_394.wav", "doc_id": "WBLMIsdIrq.seg_394", "src_text": "In this work, we try to answer these two questions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "In diesem Projekt versuchen wir, diese beiden Fragen zu beantworten:", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_260.wav", "doc_id": "oYCKgTzTDy.seg_260", "src_text": "And et cetera.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "u. a. und", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_382.wav", "doc_id": "gGbuDbHhyc.seg_382", "src_text": "Thank you and enjoy the conference.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Vielen Dank und ich nehme an der Konferenz teil.", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_652.wav", "doc_id": "FLkGnzVRew.seg_652", "src_text": "To alleviate this, we experiment over combinations of transfer learning and active learning to annotate such that more dissonant samples can be collected over lesser annotation runs, lowering the overall annotation costs while improving dissonance detection.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Um dies zu vereinfachen, können die Kombinationen des Transfers und des aktiven Lernens so zusammengesetzt werden, dass mehr Dissonanzen erfasst werden können, was die Gesamtdissoziation durch die Verbesserung der Dissoziation senkt.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_145.wav", "doc_id": "wLqFAuDnKa.seg_145", "src_text": "So it's important to select the examples from high-quality translations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "ist wichtig, die Beispiele aus qualitativ hochwertigen Übersetzungen auszuwählen,", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_502.wav", "doc_id": "dvGkKzmIaN.seg_502", "src_text": "Are you copying my model?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Werbevideo über Papier zu machen,", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_415.wav", "doc_id": "WBLMIsdIrq.seg_415", "src_text": "For each of the five discourse phenomena we identified, we create taggers to automatically identify words that pertain to the phenomenon.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Für jedes der fünf diskursiven Phänomene, die wir identifiziert haben, schaffen wir Tigrer, um Wörter, die dem Phänomen zugeordnet werden, automatisch zu identifizieren, und wir", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_110.wav", "doc_id": "uZBWfYjYnf.seg_110", "src_text": "This means that these three words will be emitted.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Das bedeutet, dass diese drei Wörter ausgelassen werden.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_796.wav", "doc_id": "WTTtiRKFZI.seg_796", "src_text": "\"Marge read this absolutely fascinating book about bees yesterday.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "sind, das absolut faszinierende Buch über die Bisse gestern", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_98.wav", "doc_id": "uZBWfYjYnf.seg_98", "src_text": "And training and maintaining several models to reach different latency regimes.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "verschiedene Optimierungsziele umfasst. „Training und Aufrechterhaltung von mehreren Modellen mit unterschiedlichen Latenzzeiten, z.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_697.wav", "doc_id": "oaOHnMCwad.seg_697", "src_text": "We then take the annotations by demographic and compare them to the models and datasets using a Pearson's R correlation score, and thus our framework actually differs from annotator disagreement literature by comparing end users with models and datasets, predictions and labels, as opposed to looking at just annotator agreement or modelling annotator distributions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dann nehmen wir die Anmerkungen nach demografischen Gesichtspunkten vor und vergleichen sie mit den Modellen und Datensätzen unter Verwendung des Korrelationskoeffizienten von Pearson. Und so unterscheidet sich das Framework von der Annotator-Literatur, indem es Endbenutzer mit Modellen und Datensätzen vergleicht und nur Annotator-Vereinbarungen oder -Modelle betrachtet. Unsere Frameworks sind", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_321.wav", "doc_id": "dJGfOSFgZO.seg_321", "src_text": "ABC-Eval is capable of measuring the rates at which chat models will commit various thematic errors.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "in der aktuellen Literatur beziehen. A B C D E F G. Beispielsweise misst", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_238.wav", "doc_id": "oYCKgTzTDy.seg_238", "src_text": "And we also consider Cross-lingual Zero-shot and Few-shot transfer.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "auch die Übersetzung von Sprachcode-Zero-Shot- und Few-Shot-Übertragungen.", "score": 48.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_794.wav", "doc_id": "WTTtiRKFZI.seg_794", "src_text": "This is illustrated here.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "werden kann, wie hier gezeigt.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_825.wav", "doc_id": "WTTtiRKFZI.seg_825", "src_text": "So see the paper for the full arguments.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "also das Papier für die vollständige Vereinbarung und die Argumente an, es tut mir leid,", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_771.wav", "doc_id": "WTTtiRKFZI.seg_771", "src_text": "Hi, my name is Adam Przepiórkowski and this talk is about the Dependency Structure of Coordination.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hallo, mein Name ist Adam Schrödinger, und dieses Gespräch dreht sich um die Abhängigkeitsstruktur der Koordination.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_335.wav", "doc_id": "dJGfOSFgZO.seg_335", "src_text": "They produce irrelevant information in around 15% of the responses, and they contradict themselves or their partner around 10% of the time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Sie produzieren irrellevante Informationen in rund fünfzehn Prozent der Antworten, und sie widersprechen sich selbst oder ihrem Partner in rund zehn Prozent", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_376.wav", "doc_id": "gGbuDbHhyc.seg_376", "src_text": "For example, report if the model selection is done via clean validation samples.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "zum Beispiel, ob die Modellauswahl anhand von sauberen Validierungsmustern erfolgt.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_612.wav", "doc_id": "oeooqChmKK.seg_612", "src_text": "First, we have the typical setting: \"Background-Pretrain\", where background knowledge is assumed to be available at pretrain time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zuerst mussten wir die Kassetten aufnehmen. Kommen Sie, ich zeige Ihnen, wie es geht. Es wird angenommen, dass Hintergrundwissen zur Vorbereitungszeit verfügbar ist.", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_720.wav", "doc_id": "oaOHnMCwad.seg_720", "src_text": "And the other is to do NLP research with the lens of perspectivism.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "eine Untersuchung der Perspektiven. Die", "score": 76.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_750.wav", "doc_id": "XejEJmgUmE.seg_750", "src_text": "And we can do the same for unacceptability case.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "können dasselbe für den Fall der Unannehmbarkeit tun.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_412.wav", "doc_id": "WBLMIsdIrq.seg_412", "src_text": "And finally, we look at different individual tokens that have high P-CXMI.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Und schließlich betrachten wir verschiedene einzelne Token, die eine hohe PXS haben,", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_205.wav", "doc_id": "SLpqvupgvW.seg_205", "src_text": "The AltEntities Corpus has 6,000 alternative questions across three domains, and it has 42,000 indirect referring expressions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Das Altentitäts-Korpus hat 6000 alternative Fragen auf drei Domänen und 42.000 indirekte Referenzausdrücke; die", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_376.wav", "doc_id": "gGbuDbHhyc.seg_376", "src_text": "For example, report if the model selection is done via clean validation samples.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Beispiel berichten Sie, ob die Modellauswahl anhand sauberer Validierungsbeispiele durchgeführt wird.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_476.wav", "doc_id": "SUkmfOTvGi.seg_476", "src_text": "So what is needed for a good generalization?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Was ist also für eine gute Verallgemeinerung erforderlich? In", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_326.wav", "doc_id": "dJGfOSFgZO.seg_326", "src_text": "From our analysis of these evaluation results, we found that ABC-Eval behavior labels are overall more reliable than labels collected by existing methods, as measured by inter-annotator agreement on 100 doubly-labeled conversations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Aus unseren Analysen der Bewertungsergebnisse haben wir festgestellt, dass die Verhaltensetiketten von ABC-Eval (Behavioral Label) im Allgemeinen zuverlässiger sind als Etiketten, die von existierenden Methoden gesammelt wurden, wie beispielsweise durch ein internatürliches Abkommen über hundert zweifach beschriftete Gespräche.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_344.wav", "doc_id": "gGbuDbHhyc.seg_344", "src_text": "I'd like to begin with a brief introduction to weak supervision and weakly supervised learning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Ich möchte mit einer kurzen Einführung zu Wochenüberwachung und wöchentlichen Überwachung beginnen.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_768.wav", "doc_id": "XejEJmgUmE.seg_768", "src_text": "And the MPP evaluation the way that we do it currently with short and single sentence input, may not fully capture the language models abstract knowledge throughout the context window.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und die Mp-Bewertung, die Art und Weise, wie wir es korrekt mit kurzen und einfachen Sätzen machen, mag das abstrakte Wissen der Sprachmodelle im gesamten Kontext", "score": 41.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_159.wav", "doc_id": "SLpqvupgvW.seg_159", "src_text": "Hi!", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hi,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_293.wav", "doc_id": "PIZEXUFLAR.seg_293", "src_text": "Here is our main result.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hier ist unser Hauptergebnis:", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_488.wav", "doc_id": "SUkmfOTvGi.seg_488", "src_text": "This means that every unit of improvement that we made, on CoNLL-2003 translates to more than one unit improvement on CoNLL++ which means that there is no diminishing returns.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "bedeutet, dass jede Verbesserung, die wir auf Coral 2003 vorgenommen haben, sich auf mehr als eine Verbesserung auf CoralPlus auswirkt. Dies bedeutet, dass es keine abnehmenden Renditen gibt.", "score": 51.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_841.wav", "doc_id": "GvEBWkLmuI.seg_841", "src_text": "And both of the women of color personas make references to ancestry while the white man persona has nothing of the sort.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Und beide der farbigen Persönlichkeiten beziehen sich auf ihre Abstammung, während die Persönlichkeit des weißen Mannes nichts davon hat.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_725.wav", "doc_id": "oaOHnMCwad.seg_725", "src_text": "And so that concludes our presentation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und so ist diese Präsentation, aber", "score": 22.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_256.wav", "doc_id": "oYCKgTzTDy.seg_256", "src_text": "Pretraining on English natural language can significantly boost the performance of Few-shot on target natural languages, and we found multilingual language models such as Codex and BLOOM are still inadequate for cross-lingual semantic parsing tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "liefern können, was die Leistung von Sprachmodellen erheblich steigern kann. Und wir finden mehrsprachige Sprachmodelle wie Codas und Blau, die noch immer für die Übersetzung vieler Sprachen geeignet sind.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_481.wav", "doc_id": "SUkmfOTvGi.seg_481", "src_text": "We found that usually larger models lead to better generalization.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir haben herausgefunden, dass größere Modelle in der Regel zu einer besseren Generalisierung führen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_548.wav", "doc_id": "rISrKoXQCx.seg_548", "src_text": "So language models are trained on large scale web crawl data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die Sprichwörter werden auf großen Web-Crowds-Daten trainiert.", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_369.wav", "doc_id": "gGbuDbHhyc.seg_369", "src_text": "As we can see from the figures, the vanilla model, termed FTw, initially underperforms more complicated WSL methods, like COSINE.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wie wir aus den Abbildungen sehen, unterperformt das Vallina-Modell, das als „FTW“ bezeichnet wird, zunächst komplexere WS-L-Methode wie Kösine.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_619.wav", "doc_id": "oeooqChmKK.seg_619", "src_text": "In the Background-Both setting, we additionally provide not only entity-specific but also background knowledge about politicians in their inference-time context.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "ist. Bei der Hintergrundbestimmung liefern wir nicht nur unspezifische, sondern auch Hintergrundwissen über Politiker im Kontext der Infiltration.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_38.wav", "doc_id": "aQpIWggfCo.seg_38", "src_text": "We find CoScript shows high pluralism in the generated specific goals.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "wir stellen fest, dass CoScript einen Hypersyndrom in den generierten spezifischen Zielen zeigt;", "score": 61.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_119.wav", "doc_id": "uZBWfYjYnf.seg_119", "src_text": "If you want to discover more results, read our paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wenn Sie mehr Ergebnisse entdecken möchten, lesen Sie unsere Akte", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_533.wav", "doc_id": "dvGkKzmIaN.seg_533", "src_text": "Then the provider requests the embeddings from the stealer's service with the data set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dann bittet der Anbieter um Einbettungen von dem ähnlichen Dienst mit der Datensammlung.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_99.wav", "doc_id": "uZBWfYjYnf.seg_99", "src_text": "For example, training a model with an average of one second latency and another one with two seconds latency, and so on.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "z. B. das Training eines Modells mit einer durchschnittlichen Latenz von einer Sekunde und eines anderen mit einer Latenz von zwei Sekunden usw. Also, was", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_262.wav", "doc_id": "oYCKgTzTDy.seg_262", "src_text": "Thanks for listening.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "fürs Zuhören.", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_363.wav", "doc_id": "gGbuDbHhyc.seg_363", "src_text": "Our second finding is that increasing the number of clean validation samples will help WSL approaches to achieve better performance, as shown in the figure on the left.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Unser zweites Ergebnis ist, dass die Erhöhung der Anzahl der Clean-Validation-Samples die WSL-Ansätze bei besseren Leistungen unterstützen wird, wie in der Abbildung auf der linken Seite dargestellt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_648.wav", "doc_id": "FLkGnzVRew.seg_648", "src_text": "As can be seen here, dissonance was only found in 3.5% of the annotated pairs.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wie man hier sehen kann, war Dissonanz nur in fünf Prozent der annotierten Paare zu", "score": 74.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_265.wav", "doc_id": "PIZEXUFLAR.seg_265", "src_text": "Recently, many studies have shown that instruction tuning enables large language models to perform on unseen tasks in a zero-shot manner by following natural instructions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "In jüngster Zeit haben viele Studien gezeigt, dass die Anpassung von Anweisungen es großen Sprachmodellen ermöglicht, unsichtbare Aufgaben in einer vollständigen Art und Weise auszuführen, indem sie natürlichen Anweisungen folgen.", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_415.wav", "doc_id": "WBLMIsdIrq.seg_415", "src_text": "For each of the five discourse phenomena we identified, we create taggers to automatically identify words that pertain to the phenomenon.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Für jedes der fünf diskursphänomene, die wir identifiziert haben, haben wir Tiere erstellt, um Wörter zu identifizieren, die dem Phänomen zugeordnet", "score": 35.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_305.wav", "doc_id": "PIZEXUFLAR.seg_305", "src_text": "So one more thing, we are collecting a much larger multi-model instruction tuning dataset with around 150 additional vision language tasks and we will release them.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und noch etwas: Wir sammeln einen viel größeren Satz von Multimodell-Anweisungen mit ungefähr 150 zusätzlichen Sprachaufgaben und veröffentlichen sie.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_817.wav", "doc_id": "WTTtiRKFZI.seg_817", "src_text": "Here we have coordination of two verbs and there's no outsides, external governor.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "die Koordination von zwei Wörtern durch den äußeren Regierungsvertreter vorgenommen.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_455.wav", "doc_id": "hgIDlKNiFM.seg_455", "src_text": "We also observe that using more data translated to better performance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "und wir können auch erkennen, dass die Verwendung mehrerer Daten zu einer besseren Leistung führt.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_802.wav", "doc_id": "WTTtiRKFZI.seg_802", "src_text": "When you swap these two constituents, the sum of these two dependencies becomes 6.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wenn man sich bewegt, wenn man diese beiden Konstituenten tauscht, wird der Großteil dieser beiden Abhängigkeiten zu", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_662.wav", "doc_id": "FLkGnzVRew.seg_662", "src_text": "We compare this to the other state-of-the-art AL strategies that are commonly used in the community.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir vergleichen dies mit den anderen Strategien des Staates der Kunst, die in der Gemeinschaft üblicherweise verwendet werden.", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_645.wav", "doc_id": "FLkGnzVRew.seg_645", "src_text": "To the goal of creating a cognitive dissonance resource, we conducted a large scale annotation of dissonance relations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zum Ziel der Schaffung einer kognitiven Diskrepanzressource führten wir eine große Anzahl von Diskrepanzrelationen durch.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_31.wav", "doc_id": "aQpIWggfCo.seg_31", "src_text": "Creating the dataset is an essential step to this end.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Die Kreation des Datenbestands ist ein entscheidender Schritt zu seinem Ende.", "score": 84.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_45.wav", "doc_id": "aQpIWggfCo.seg_45", "src_text": "Thanks for your time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Vielen Dank für Ihre Zeit.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_60.wav", "doc_id": "TVCREhgqUP.seg_60", "src_text": "A popular method to address this is to integrate trees into the models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Ansatz, um dies anzugehen, besteht darin, Bäume in die Modelle zu integrieren.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_534.wav", "doc_id": "dvGkKzmIaN.seg_534", "src_text": "The cosine and L2 similarity between the requested embedding and the target embedding are computed.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Die Cosine- und L2-Similität zwischen dem angeforderten Embedding und dem Ziel-Embedding werden berechnet.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_733.wav", "doc_id": "XejEJmgUmE.seg_733", "src_text": "So the minimal pair paradigm basically evaluates language models on top of acceptability judgments.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Das Minimal-Paar-Paradigma bewertet Sprachmodelle grundsätzlich auf der Grundlage von Akzeptabilitätsurteilen,", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_110.wav", "doc_id": "uZBWfYjYnf.seg_110", "src_text": "This means that these three words will be emitted.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Das bedeutet, dass diese drei Wörter ausgesendet werden.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_104.wav", "doc_id": "uZBWfYjYnf.seg_104", "src_text": "That is the cross-attention mechanism, and you can see an example on the right.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "ist ein Kreuzmechanismus, und Sie können ein Beispiel auf der rechten Seite sehen.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_643.wav", "doc_id": "FLkGnzVRew.seg_643", "src_text": "Studying dissonance expressed in language can also be beneficial in understanding extremism and polarization of vulnerable groups.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die Untersuchung der Ausdrucksweise der Distanz kann auch von Vorteil sein, um Extremismus und Polarisierung von marginalisierten Gruppen zu verstehen.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_55.wav", "doc_id": "TVCREhgqUP.seg_55", "src_text": "These utterances are paired with logical forms that represent core aspects of their meaning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Diese Antiquitäten, die mit logischen Formen versehen waren, zeigten den Kernaspekt ihrer Bedeutung.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_780.wav", "doc_id": "WTTtiRKFZI.seg_780", "src_text": "The conjunction headed approach assumed in Prague dependency treebanks, where coordinate structures are headed by the conjunction.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Konjunkturprozesse in Abhängigkeit von den Trägern, wo die koordinierten Strukturen von der Konjunktur getragen werden.", "score": 32.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_605.wav", "doc_id": "oeooqChmKK.seg_605", "src_text": "The task here is to identify the correct entity that the pronoun \"he\" refers to, which in this case is Servin.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Die Aufgabe hier ist es, die richtige Entität zu identifizieren, auf die das Pronomen verweist, was in diesem Fall Serwin", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_568.wav", "doc_id": "rISrKoXQCx.seg_568", "src_text": "We can see that language models generally had a political leaning that is further away from the centre after 2017.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir können die Sprachmodelle im Allgemeinen als politisiert bezeichnen, die nach dem siebzehnten Jahrhundert entstanden sind,", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_72.wav", "doc_id": "TVCREhgqUP.seg_72", "src_text": "We introduce a new method to predict the permutation that does not put any hard constraints on the possible permutations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir stellen eine neue Methode vor, um eine Permutation vorherzusagen, die keine harten Einschränkungen auf die möglichen Permutationen", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_15.wav", "doc_id": "aQpIWggfCo.seg_15", "src_text": "We find that all language models achieve unsatisfactory results on planning for specific goals.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir stellen fest, dass alle Lialy-Modelle bei der Planung für bestimmte Ziele unzufriedenstellende Ergebnisse erzielen.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_328.wav", "doc_id": "dJGfOSFgZO.seg_328", "src_text": "For example, you can see how measuring the proportion of turns with self and partner contradictions explains 5% and 10% of conversation quality, respectively, while the average Likert consistency scores explain only 4% or less.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "mit sich selbst und Partnerkontroversen mit fünf Prozent und zehn Prozent der Konversationsqualität messen lässt, während die durchschnittlichen Liker-Konsistenzwerte nur vier Prozent oder weniger betragen.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_33.wav", "doc_id": "aQpIWggfCo.seg_33", "src_text": "Thus, we follow the idea of symbolic knowledge distillation, to distil constrained language planning datasets from large language models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Daher folgen wir der Idee der symbolischen Wissensdestillation, um begrenzte Sprachplanungsdaten aus Sprachmodellen zu extrahieren.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_873.wav", "doc_id": "GvEBWkLmuI.seg_873", "src_text": "So based on these patterns, we conclude with three recommendations for model owners.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "können wir drei Empfehlungen für Modelleigentümer", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_837.wav", "doc_id": "GvEBWkLmuI.seg_837", "src_text": "And we can immediately see that this is very generalizable to any demographic because we can just specify whatever identity marker that we want into this prompt.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und wir können sofort sehen, dass dies sehr allgemein ist, denn wir können nur die Identität angeben, die wir in diesem Promt haben.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_447.wav", "doc_id": "hgIDlKNiFM.seg_447", "src_text": "In addition to this comparison, we introduced three models trained on continual pre-training to analyze the impact of pre-training strategy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Darüber hinaus zur Vergleichbarkeit führen wir drei Modell-Trainingsstrategien auf kontinuierliche Pretraining ein, um die Auswirkungen der Pretrainingsstrategie zu analysieren.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_577.wav", "doc_id": "rISrKoXQCx.seg_577", "src_text": "For example, if right-leaning language models were to be fine-tuned on hate speech or misinformation or whatever and deployed to a popular social media platform, this would mean that, people with opposite political opinions might be marginalised and hate speech targeting minority groups might just run rampant without any control.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Beispielsweise müssen Sprachmodelle, die eine Sprachverarbeitung durchführen, auf einer beliebigen sozialen Plattform bereitgestellt werden. Dies würde bedeuten, dass Menschen mit gegensätzlichen politischen Meinungen marginalisiert werden könnten, und die Hassrede gegen Minderheiten könnte sich ohne jede", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_164.wav", "doc_id": "SLpqvupgvW.seg_164", "src_text": "\"Did you mean 'Easy on Me' or 'I Gotta Feeling'?\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Frage: Meinten Sie es mit „easy on me“ oder habe ich eine Gefühle?", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_405.wav", "doc_id": "WBLMIsdIrq.seg_405", "src_text": "First, we look at part-of-speech tags that have high mean P-CXMI.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zunächst sehen wir uns die Sprechtabellen an, die hohe Werte von „psx“ haben.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_495.wav", "doc_id": "SUkmfOTvGi.seg_495", "src_text": "So going back to the question that we raised in the title of our paper Do CoNLL-2003 taggers still work in 2023?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Um also auf die Frage zurückzukommen, die wir im Titel unseres Papier gestellt haben: Funktionieren die Carnell 2003-Tags", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_71.wav", "doc_id": "TVCREhgqUP.seg_71", "src_text": "That's why in the second step we use another model to predict a permutation to put them into the right order.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Deshalb verwenden wir im zweiten Schritt ein anderes Modell, um eine Permutation vorherzusagen, um sie in die richtige Reihenfolge zu bringen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_392.wav", "doc_id": "WBLMIsdIrq.seg_392", "src_text": "Firstly because only a small portion of translations depend on context which makes corpus-level metrics like BLEU unable to capture these translations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "ist jedoch ziemlich schwierig, weil nur ein kleiner Teil der Übersetzungen vom Kontext abhängt, was die korpusbasierten Metriken wie Blue nicht fassen können.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_577.wav", "doc_id": "rISrKoXQCx.seg_577", "src_text": "For example, if right-leaning language models were to be fine-tuned on hate speech or misinformation or whatever and deployed to a popular social media platform, this would mean that, people with opposite political opinions might be marginalised and hate speech targeting minority groups might just run rampant without any control.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "dringend ist. Zum Beispiel sollten richtlinienkonforme Sprachmodelle in der Lage sein, eine feine Sprache und Informationen zu verfeinern und einen beliebten sozialen Medien-Plattform zu verwenden. Dies würde bedeuten, dass Menschen mit unterschiedlichen politischen Meinungen marginalisiert werden könnten, und dass der Hass, der sich auf Minderheitengruppen", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_352.wav", "doc_id": "gGbuDbHhyc.seg_352", "src_text": "We can't stop on this problem setting, but this implies that additional manual annotations are required in weakly supervised learning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "an dieser Problemstellung, aber dies impliziert, dass zusätzliche manuelle Anmerkungen in der wöchentlichen Überwachung erforderlich sind.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_794.wav", "doc_id": "WTTtiRKFZI.seg_794", "src_text": "This is illustrated here.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "die Position bewegt werden", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_616.wav", "doc_id": "oeooqChmKK.seg_616", "src_text": "For example, because new occupations have developed since the time of pretraining.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "beispielsweise neue Berufe seit dem Training entwickelt haben.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_742.wav", "doc_id": "XejEJmgUmE.seg_742", "src_text": "So what we do is that to simulate these longer sequences, we revisit the data sets themselves and then we recreate sentences by choosing acceptable or unacceptable sentences from those datasets.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "was wir tun, ist, diese längeren Sequenzen zu simulieren, indem wir die Datenbanken selbst besuchen und dann Sätze neu erstellen, indem wir Sätze auswählen, die entweder akzeptabel oder unakzeptabel sind, aus diesen Datenbanken. Zum", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_260.wav", "doc_id": "oYCKgTzTDy.seg_260", "src_text": "And et cetera.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "vielen Dank für", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_640.wav", "doc_id": "FLkGnzVRew.seg_640", "src_text": "So why does this matter?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Warum ist das wichtig?", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_278.wav", "doc_id": "PIZEXUFLAR.seg_278", "src_text": "In which the input text, images, instructions and bounding boxes are represented in the same token space.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In dem die Eingabedaten, Bilder, Anweisungen und Grenzboxen im selben Tokenraum dargestellt werden.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_476.wav", "doc_id": "SUkmfOTvGi.seg_476", "src_text": "So what is needed for a good generalization?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Was braucht man also für eine gute Verallgemeinerung? Durch unsere", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_250.wav", "doc_id": "oYCKgTzTDy.seg_250", "src_text": "In this figure, the blue line is Cross-lingual Few-shot transfer.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "In dieser Figur ist die blaue Linie eine krosssprachige Füllübertragung, die orangene Linie eine", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_159.wav", "doc_id": "SLpqvupgvW.seg_159", "src_text": "Hi!", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_178.wav", "doc_id": "SLpqvupgvW.seg_178", "src_text": "And with that, Bob sets the dialogue context.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und mit diesem Satz setzt Bob den Dialog fort.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_692.wav", "doc_id": "oaOHnMCwad.seg_692", "src_text": "We do this through our framework NLPositionality.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir tun dies durch unser Framework \"NL Positionality\".", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_418.wav", "doc_id": "WBLMIsdIrq.seg_418", "src_text": "We then use the MuDA tagger, by applying the tagger on a parallel corpus that we want to use for evaluation and we apply our translation metrics of choice on the context-dependent examples that the MuDA tagger has identified.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dann verwenden wir den Muda-Tagger, indem wir den Tagger auf den parallelen Korpus anwenden, den wir für die Auswertung verwenden möchten, und unsere Übersetzungsmetriken der Wahl auf die kontextspezifischen Beispiele, die der Muda-Tagger identifiziert hat.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_753.wav", "doc_id": "XejEJmgUmE.seg_753", "src_text": "So how does the model do?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wie sieht es mit", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_365.wav", "doc_id": "gGbuDbHhyc.seg_365", "src_text": "But that's not the end of the story, because if we either way decide to access clean samples, then training on them directly will even achieve better performance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Aber das ist noch nicht das Ende der Geschichte, denn wenn wir uns für die Zugriff auf saubere Proben entscheiden, dann wird das Training auf ihnen direkt sogar noch bessere Leistung erzielen.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_82.wav", "doc_id": "TVCREhgqUP.seg_82", "src_text": "Some other kinds of structural generalization remain very challenging, though.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Andere Arten der strukturellen Generierung erinnern daran, sehr herausfordernd zu sein.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_102.wav", "doc_id": "uZBWfYjYnf.seg_102", "src_text": "Use only one model for every latency regime and handle latency through specific parameters.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "verwenden Sie nur ein Modell für jeden Latenzregime und handhaben Sie Latenz durch spezifische Parameter:", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_111.wav", "doc_id": "uZBWfYjYnf.seg_111", "src_text": "If we look at the main results of EDAtt, we'll plot the simultaneous speech translation results on graphs in which we have BLEU on one side that measures the translation quality, and average lagging that is the latency measure, and we also consider the computational aware average lagging that accounts for the model's computational times to predict the output.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wenn man sich die wichtigsten Ergebnisse davon anschaut, dann ist das so. Wir klappen die simultane Sprachübersetzungsergebnisse auf Grafiken, auf denen wir auf einer Seite Blau haben, das die Übersetzungskvalität und das durchschnittliche Leiden Das ist die Latenzzeitmaßnahme, und wir betrachten auch das computergestützte durchschnittliche Leistungsmangel, das für die computergestützte Zeit zur Vorhersage der Ausgabe der Modelle verantwortlich ist.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_364.wav", "doc_id": "gGbuDbHhyc.seg_364", "src_text": "Typically we only need 20 samples per class to attain high performance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Normalerweise benötigen wir nur zwanzig Proben pro Klasse, um eine hohe Leistung zu erzielen.", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_397.wav", "doc_id": "WBLMIsdIrq.seg_397", "src_text": "To answer the first question, we started by measuring how much a word depends on context during translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Um die erste Frage zu beantworten, begannen wir damit, zu messen, wie viel das Wort von dem Kontext abhängt.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_139.wav", "doc_id": "wLqFAuDnKa.seg_139", "src_text": "So in this example here, where we perform translation from German into English, the German sentences, the source sentences, are marked with German colon and the English translations with English colon.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In diesem Beispiel, wo wir eine Übersetzung von Deutsch ins Englisch vornehmen, sind die deutschen Sätze mit deutscher Klammer und die englischen Sätze mit englischer Klammer gekennzeichnet.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_337.wav", "doc_id": "dJGfOSFgZO.seg_337", "src_text": "However, this is all the more reason to pursue reliable and precise evaluation metrics for comparing models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Dies ist jedoch der Hauptgrund, um zuverlässige und präzise Bewertungsmaßstäbe für die Vergleich von Modellen zu verwenden.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_599.wav", "doc_id": "oeooqChmKK.seg_599", "src_text": "We evaluate the data set with human study participants and established coreference resolution models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "wir bewerten das Datensatz mit menschlichen Studienteilnehmern und etablierten Korreferenz-Resolution-Modellen. Hier ist", "score": 62.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_73.wav", "doc_id": "TVCREhgqUP.seg_73", "src_text": "This makes our approach quite flexible and expressive.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "hat, was unseren Ansatz sehr flexibel und ausdrucksstark macht.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_12.wav", "doc_id": "aQpIWggfCo.seg_12", "src_text": "As shown in the table, we extend the abstract goals with multi-faceted constraints for human-in-the-loop data acquisition using InstructGPT.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wie in der Tabelle dargestellt, wobei wir die abstrakten Ziele mit mehrphasigen Einschränkungen für die menschliche Datenerfassung erweitern.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_821.wav", "doc_id": "WTTtiRKFZI.seg_821", "src_text": "So I'll concentrate on the right one.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "das richtige ist. Was wir hier sagen, ist,", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_67.wav", "doc_id": "TVCREhgqUP.seg_67", "src_text": "For the first time, we show strong generalization to deeper recursion without relying on trees.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zum ersten Mal werden wir eine starke Generalisierung zeigen, um eine Rekursion ohne Reling auf Tönen zu erzeugen.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_416.wav", "doc_id": "WBLMIsdIrq.seg_416", "src_text": "And we called our tagger the Multilingual Discourse-Aware, or MuDA tagger.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und wir nennen unsere Tags „multifunktionale Diskurse“ oder „multifunktionale Diskurse“.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_636.wav", "doc_id": "FLkGnzVRew.seg_636", "src_text": "This belief and action are inconsistent, and they are in dissonance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Diese Überzeugung und Handlung sind inkonsistent und widersprüchlich. Weiterhin zu erwähnen,", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_594.wav", "doc_id": "oeooqChmKK.seg_594", "src_text": "For example, in the sentence, \"John saw the newly elected president on TV.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zum Beispiel, in der Verurteilung, sah John den neu gewählten Präsidenten auf dem Fernsehen.", "score": 34.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_565.wav", "doc_id": "rISrKoXQCx.seg_565", "src_text": "And we also try to investigate whether language models can pick up the polarisation that's prevalent in our modern society.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und wir versuchen auch, die Sprachmodelle zu untersuchen, die die Polarisierung erfassen können, die in unserer modernen Gesellschaft vorherrscht. Wir", "score": 88.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_538.wav", "doc_id": "dvGkKzmIaN.seg_538", "src_text": "We assume the provider apply wiki text data set to count word frequency.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "gehen davon aus, dass der Anbieter den Datensatz WikiText verwendet, um Wörter mit einer bestimmten Häufigkeit zu zählen.", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_455.wav", "doc_id": "hgIDlKNiFM.seg_455", "src_text": "We also observe that using more data translated to better performance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "können wir diese Daten aus heterogenen Quellen erhalten, was zu besseren Leistungen führt,", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_106.wav", "doc_id": "uZBWfYjYnf.seg_106", "src_text": "A word is emitted if the attention is not concentrated, that is, its sum is below a certain threshold alpha towards the last lambda speech frames, meaning that the received information is enough stable.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Ein Wort wird gesendet, wenn die Aufmerksamkeit nicht konzentriert ist, d. h. wenn die Summe unter einem bestimmten Schwellenwert Alpha liegt, was bedeutet, dass die empfangene Information genug stabil ist,", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_171.wav", "doc_id": "SLpqvupgvW.seg_171", "src_text": "Here are some examples of indirect references for example, \"the newer one\" or \"the song that's not energetic.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hier sind einige Beispiele für direkte Unterschiede, zum Beispiel die neueren oder die nicht energiegeladenen Songs.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_804.wav", "doc_id": "WTTtiRKFZI.seg_804", "src_text": "That's why this sounds quite okay.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "klingt, als wäre es ganz in", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_603.wav", "doc_id": "oeooqChmKK.seg_603", "src_text": "Servin and Kea met at a park.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Bäcker; Serwin und Kiah trafen sich in einem Park,", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_572.wav", "doc_id": "rISrKoXQCx.seg_572", "src_text": "For example, for hate speech detection, left-leaning language models are better at detecting hate speech targeting socially minority groups, however are worse at detecting hate speech targeting more powerful groups in our society.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "können, dass zum Beispiel für die Erkennung von Hassrede, linke Sprachmodelle besser sind, um Hassrede in sozialen Minderheiten zu erkennen, aber schlechter, um Hassrede zu erkennen, die sich auf soziale Minderheiten bezieht. Mehr Macht für Gruppen in unserer Gesellschaft und,", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_634.wav", "doc_id": "FLkGnzVRew.seg_634", "src_text": "We begin by defining cognitive dissonance and why it is an important problem to study in language.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir beginnen mit der Definition der kognitiven Dissonanz und warum es ein wichtiges Problem ist, das man in der Sprache studieren muss.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_337.wav", "doc_id": "dJGfOSFgZO.seg_337", "src_text": "However, this is all the more reason to pursue reliable and precise evaluation metrics for comparing models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Dies ist jedoch der Hauptgrund, um zuverlässige und präzise Bewertungsmetriken für die Vergleichbarkeit von Modellen zu verfolgen.", "score": 88.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_193.wav", "doc_id": "SLpqvupgvW.seg_193", "src_text": "And finally when they have similar info boxes or attributes on Wikipedia.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "mit ähnlichen Beschreibungen auf Wikipedia. Wenn sie ähnliche Infoboxen oder Attribute auf Wikipedia haben,", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_563.wav", "doc_id": "rISrKoXQCx.seg_563", "src_text": "By further pretraining language models on such partisan corpora we can see that the ideological coordinates of the language model also correspondingly shift.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "können wir sehen, dass die ideologischen Korrespondenzen des Sprachmodells auch korrespondieren.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_201.wav", "doc_id": "SLpqvupgvW.seg_201", "src_text": "Then, we asked the annotators to pick one of these entities, for example, here's the first one, and describe them using three to five indirect referring expressions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Dann bitten wir die Korrektoren, eine dieser Entitäten auszuwählen, zum Beispiel die erste, und sie mit drei bis fünf indirekten Bezugsausdrücken zu beschreiben.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_454.wav", "doc_id": "hgIDlKNiFM.seg_454", "src_text": "However, we can observe that data from heterogeneous sources appear to be more versatile.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "hervor, dass das Modell die Aufgabe mit den Daten der gleichen Art am besten erfüllt. Allerdings", "score": 49.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_628.wav", "doc_id": "oeooqChmKK.seg_628", "src_text": "However, with task-specific training, some models successfully integrate knowledge from multiple sources.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Mit spezifischem Training integrieren einige Modelle jedoch erfolgreich Wissen aus mehreren Quellen. Trotzdem", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_607.wav", "doc_id": "oeooqChmKK.seg_607", "src_text": "First, entity-specific knowledge such as \"Servin is a judge.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zunächst ist es wichtig, dass man spezifische", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_639.wav", "doc_id": "FLkGnzVRew.seg_639", "src_text": "While dissonance is a very common phenomenon we experienced in daily decision making, they are really rare to find expressed in language among other kinds of discourse relations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "ein sehr häufiges Phänomen, das wir in der täglichen Entscheidungsfindung erleben, und sie sind wirklich bereit, sich in einer anderen Sprache auszudrücken. Warum das so ist,", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_650.wav", "doc_id": "FLkGnzVRew.seg_650", "src_text": "To no surprise, the classifier performed not much better than chance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "von Diskursen. Kein Wunder, dass der Klassifizierer nicht viel besser abschneidet. Aufgrund der geringen", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_360.wav", "doc_id": "gGbuDbHhyc.seg_360", "src_text": "Otherwise, there is a large performance drop.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Ansonsten gibt es einen großen Leistungsabfall,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_308.wav", "doc_id": "dJGfOSFgZO.seg_308", "src_text": "Hello, I'm James Finch.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo, ich bin James Finch,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_721.wav", "doc_id": "oaOHnMCwad.seg_721", "src_text": "Our third recommendation is to build specialised datasets and models within 4 specific communities.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Unsere dritte Empfehlung ist, spezielle Datenmodelle mit vier speziellen Gemeinschaften zu bauen,", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_802.wav", "doc_id": "WTTtiRKFZI.seg_802", "src_text": "When you swap these two constituents, the sum of these two dependencies becomes 6.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und diese beiden Bestandteile verschieben, wird die Summe dieser beiden Abhängigkeiten sechs", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_251.wav", "doc_id": "oYCKgTzTDy.seg_251", "src_text": "The orange line is Cross-lingual Zero-shot transfer.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "orangefarbene Linie die Kreuzsprachübertragung Null und die", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_341.wav", "doc_id": "gGbuDbHhyc.seg_341", "src_text": "Hello, I am Dawei, a PhD student at Saarland University in Germany.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo, ich bin Dawei, ein Doktorand an der Universität in Stuttgart, Deutschland.", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_545.wav", "doc_id": "dvGkKzmIaN.seg_545", "src_text": "Welcome to discuss with us.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir werden mit uns darüber sprechen.", "score": 42.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_861.wav", "doc_id": "GvEBWkLmuI.seg_861", "src_text": "In our analysis, we reveal how these seemingly positive portrayals reflect harmful patterns.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "In unserer Analyse stellen wir fest, wie diese scheinbar positiven Porträts schädliche Muster widerspiegeln.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_691.wav", "doc_id": "oaOHnMCwad.seg_691", "src_text": "So to study data set and model positionality, we actually compare the annotations with real users with existing datasets and models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Um also die Daten- und Modellpositionen zu studieren, vergleichen wir die Anmerkungen von echten Benutzern mit existierenden Datensätzen und Modellen.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_309.wav", "doc_id": "dJGfOSFgZO.seg_309", "src_text": "And I'm Sarah Finch.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und ich bin Sarah Finch,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_519.wav", "doc_id": "dvGkKzmIaN.seg_519", "src_text": "Then let me introduce the details of our embedding marker.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Dann lassen Sie mich die Details unserer Embedding-Marker vorstellen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_501.wav", "doc_id": "dvGkKzmIaN.seg_501", "src_text": "It's my pleasure to give a short advertisement video of our paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Es ist mir ein Vergnügen, ein kurzes", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_211.wav", "doc_id": "SLpqvupgvW.seg_211", "src_text": "If the language model has access only to entity names, then the accuracy is only 60%, so there's a lot of room for improvement.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "wenn. Das Sprachmodell hat nur Zugriff auf Entitätsnamen, und die Genauigkeit beträgt nur 60 %, also gibt es viel Raum für Verbesserungen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_452.wav", "doc_id": "hgIDlKNiFM.seg_452", "src_text": "These models are compared to six baseline models which are CamemBERT OSCAR 138 GB, CamemBERT OSCAR 4 GB, CamemBERT CCNET 4 GB, PubMedBERT, BioBERT, and ClinicalBERT.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hundertachtundzwanzig-Gigabyte-Zeile, eine Zeile eine Hundertachtundzwanzig-Gigabyte-Zeile, eine Zeile eine Hundertachtundzwanzig-Gigabyte-Zeile, eine Zeile eine Hundertachtundzwanzig-Gigabyte-Zeile, eine Zeile eine Hundertachtundzwanzig-Gigabyte-Zeile, eine Zeile eine", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_159.wav", "doc_id": "SLpqvupgvW.seg_159", "src_text": "Hi!", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_208.wav", "doc_id": "SLpqvupgvW.seg_208", "src_text": "But this is not realistic.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "das ist nicht realistisch.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_615.wav", "doc_id": "oeooqChmKK.seg_615", "src_text": "This last setting is especially interesting, since it simulates the case where the background knowledge necessary to solve a task is not part of the pretrain data of models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die letzte Einstellung ist besonders interessant, da sie den Fall simuliert, in dem die erforderliche Hintergrundwissen, um eine Aufgabe zu lösen, nicht Teil der vorbereitenden Daten von Modellen ist.", "score": 74.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_697.wav", "doc_id": "oaOHnMCwad.seg_697", "src_text": "We then take the annotations by demographic and compare them to the models and datasets using a Pearson's R correlation score, and thus our framework actually differs from annotator disagreement literature by comparing end users with models and datasets, predictions and labels, as opposed to looking at just annotator agreement or modelling annotator distributions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir nehmen dann die Annotationen nach demografischen Merkmalen und vergleichen sie mit den Modellen und Datensätzen mithilfe der Pearson-R-Korrelationskoeffizienten. Und so unterscheidet sich unser Framework tatsächlich von der Literatur zur annotatorischen Übereinstimmung, indem wir Endnutzer mit Modellen und Datensätzen, Vorhersagen und Etiketten vergleichen, anstatt uns nur auf die annotatorische Übereinstimmung oder das Modellieren der annotatorischen Verteilungen zu konzentrieren.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_20.wav", "doc_id": "aQpIWggfCo.seg_20", "src_text": "Previous studies have shown that the output quality of language models falls in high variance, leading to bad performance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vorherige Studien haben gezeigt, dass die Ausgabelösung von Larenium-Modellen in hohen Werten liegt, was zu schlechter Leistung führt,", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_130.wav", "doc_id": "wLqFAuDnKa.seg_130", "src_text": "And we compared to state-of-the-art systems, so the best performing system, so the WMT evaluation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir vergleichen zwei State-of-the-Art-Systeme, die besten Leistungssysteme, gemäß der WMTE-Bewertung.", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_451.wav", "doc_id": "hgIDlKNiFM.seg_451", "src_text": "To evaluate our seven models, we gather data for public and private downstream tasks such as named entity recognition, classification, part-of-speech tagging, and question answering.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "eine Hundertachtundzwanzig-Gigabyte-Zeile, eine Zeile eine Hundertachtundzwanzig-Gigabyte-Zeile, eine Zeile eine Hundertachtundzwanzig-Gigabyte-Zeile, eine Zeile eine Hundertachtundzwanzig-Gigabyte-Zeile, eine Zeile eine Hundertachtundzwanzig-Gigabyte-Zeile, eine Zeile eine Hundertachtundzwanzig-Gigabyte-Zeile, eine Zeile eine", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_815.wav", "doc_id": "WTTtiRKFZI.seg_815", "src_text": "So the governor is on the left in this example \"I saw Bart and Lisa\" so is the governor is on the left.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "diesem Beispiel auf der linken Seite, also ist der Gouverneur auf der linken Seite.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_662.wav", "doc_id": "FLkGnzVRew.seg_662", "src_text": "We compare this to the other state-of-the-art AL strategies that are commonly used in the community.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir vergleichen dies mit den anderen Staaten der Kunststrategien, die in der Gemeinschaft üblicherweise verwendet werden.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_775.wav", "doc_id": "WTTtiRKFZI.seg_775", "src_text": "A similar approach is assumed in Igor Mel'čuk's meaning text theory, where again, the whole coordinate structure is headed by the first conjuct.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "1 in Igor Milchucks Bedeutungstexttheorie, wo wiederum die gesamte koordinierte Struktur vom ersten Konjunkt angeführt wird,", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_862.wav", "doc_id": "GvEBWkLmuI.seg_862", "src_text": "First, from our groups, the top words include things like \"culture\", \"tradition\", \"proud\", and \"exotic\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zunächst umfassen die oberen Wörter für markierte Gruppen Dinge wie Kultur, Tradition, stolz und exotisch", "score": 68.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_270.wav", "doc_id": "PIZEXUFLAR.seg_270", "src_text": "However, there is no large-scale publicly-available multi-modal instruction task.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Sprachanweisungen, aber es gibt keine öffentlich zugängliche multimodale Anweisung. Daher motiviert", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_846.wav", "doc_id": "GvEBWkLmuI.seg_846", "src_text": "The second part is marked words, which is a method to identify the words that distinguish marked groups from unmarked ones, which I'll elaborate on shortly.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Der zweite Teil ist Markierte Wörter, die eine Methode darstellen, um die Wörter zu identifizieren, die Markengruppen von unmarkierten unterscheiden, die ich kurz vorbereitet habe.", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_137.wav", "doc_id": "wLqFAuDnKa.seg_137", "src_text": "So, it's important to select a good prompting strategy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "gehen, daher ist es wichtig, die richtige Promoting-Strategie auszuwählen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_672.wav", "doc_id": "FLkGnzVRew.seg_672", "src_text": "Feel free to get in touch with us if you have any questions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Ich danke Ihnen, dass Sie Zeit haben, sich damit auseinanderzusetzen.", "score": 39.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_345.wav", "doc_id": "gGbuDbHhyc.seg_345", "src_text": "In weak supervision, you do not manually label the data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Bei schwacher Überwachung kennzeichnen wir die Daten nicht manuell, sondern verwenden stattdessen", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_747.wav", "doc_id": "XejEJmgUmE.seg_747", "src_text": "And we can also do the same by choosing sentences from a different subset or a different data set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Und wir können dasselbe auch tun, indem wir Sätze aus einer anderen Teilmenge oder einem anderen Datensatz auswählen,", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_316.wav", "doc_id": "dJGfOSFgZO.seg_316", "src_text": "One approach is to simply ask human judges to evaluate several dimensions of dialogue quality, such as the relevance of model responses using existing comparative or Likert scale methods.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Eine Herangehensweise ist es, einfach menschliche Richter zu bitten, mehrere Dimensionen der Qualität des Dialogs zu bewerten, wie die Relevanz von Modellantworten, wobei es existierende vergleichende oder lizenzfreie Skalierungsmethoden verwendet.", "score": 87.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_759.wav", "doc_id": "XejEJmgUmE.seg_759", "src_text": "And there we see that the MPP judgments either increase or decrease significantly when you add either acceptable prefixes or unacceptable prefixes.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und sehen, dass die MP-P-Judikate entweder stark an- oder abnehmen, wenn man entweder akzeptable oder unakzeptable Präfixe hinzufügt.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_718.wav", "doc_id": "oaOHnMCwad.seg_718", "src_text": "So we have a few recommendations for this.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Also haben wir einige Empfehlungen", "score": 86.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_536.wav", "doc_id": "dvGkKzmIaN.seg_536", "src_text": "Meanwhile, we also apply KS test and use its p-value as the third metric.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "L2 definiert. leichzeitig wenden wir auch den k-Test an und verwenden seinen p-Wert als dritte Metrik.", "score": 81.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_764.wav", "doc_id": "XejEJmgUmE.seg_764", "src_text": "And after doing like several of these perturbations, we find that none of these noises are actually making the model like change its course in terms of how it shows us the MPP judgement print.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "nachdem wir so einige dieser Störungen durchgeführt haben, stellen wir fest, dass keines dieser Rauschen tatsächlich die Modelle ändert. It quotes in terms of how it shows us the MP", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_850.wav", "doc_id": "GvEBWkLmuI.seg_850", "src_text": "So when people are describing a warrior who is a woman, they'll usually actually specify \"woman warrior\" and mark the term with \"woman\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "„Krieger“ in Verbindung gebracht, so dass, wenn jemand eine Kriegerin beschreibt, die normalerweise tatsächlich eine Kriegerin ist, sie normalerweise eine Kriegerin bezeichnet und den Begriff mit Kriegerin", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_811.wav", "doc_id": "WTTtiRKFZI.seg_811", "src_text": "So when the difference between the lengths of the two conjuncts grows, the shorter conjunct prefers to be the first one, stronger, right?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wächst der Unterschied zwischen den Längen der beiden Konjunktionen, und die kürzere Konjunktion ist die", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_92.wav", "doc_id": "uZBWfYjYnf.seg_92", "src_text": "Hi, I'm Sara Papi from the University of Trento and Foundazione Bruno Kessler and I will briefly introduce the \"Attention as a Guide for Simultaneous Speech Translation\" paper, that is a joint work with Matteo Negri and Marco Turchi.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hallo, ich bin Sara Papi von der Universität von Trento und der Bruno Kessler Stiftung und ich werde die Aufmerksamkeit als eine Leitfäden kurz vorstellen. Für das Simultaneous Speech Translation Paper ist es eine gemeinsame Arbeit mit Matteo Negri und Marco Turchi.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_838.wav", "doc_id": "GvEBWkLmuI.seg_838", "src_text": "So here are some example generations from GPT-4.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hier sind einige Beispielgenerationen von GpT.", "score": 69.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_298.wav", "doc_id": "PIZEXUFLAR.seg_298", "src_text": "We use one instruction versus 5 instruction.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir haben eine Anweisung gegenüber fünf Anweisungen verwendet, da", "score": 77.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_522.wav", "doc_id": "dvGkKzmIaN.seg_522", "src_text": "Before these main steps, we first select a trigger set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "erfolgen, wählen wir zunächst ein Trigger-Set aus.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_97.wav", "doc_id": "uZBWfYjYnf.seg_97", "src_text": "Long and complicated training procedures, for example, training involving different optimization objectives.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und komplizierte Trainingsverfahren, z. B. das Training mit unterschiedlichen Optimierungszielen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_513.wav", "doc_id": "dvGkKzmIaN.seg_513", "src_text": "Second, the watermark should not degrade the utility of the provided embeddings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Zweitens sollte das Wasserzeichen die Nutzbarkeit der bereitgestellten Einbettungen nicht verringern.", "score": 83.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_86.wav", "doc_id": "TVCREhgqUP.seg_86", "src_text": "In addition, sometimes there are multiple permutations that are consistent with the data, but the linguistically correct one is latent.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Darüber hinaus gibt es manchmal mehrere Umwandlungen, die mit den Daten konsistent sind, aber die linguistisch korrekte ist latent.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_727.wav", "doc_id": "oaOHnMCwad.seg_727", "src_text": "Thank you.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vielen Dank.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_312.wav", "doc_id": "dJGfOSFgZO.seg_312", "src_text": "So let's say that you just developed a dialogue model and you want to see how well it compares against the current state-of-the-art.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Sagen wir, Sie haben gerade das Dialogmodell entwickelt und möchten sehen, wie es sich mit dem aktuellen Stand der Technik vergleicht.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_561.wav", "doc_id": "rISrKoXQCx.seg_561", "src_text": "Secondly, we aim to investigate to which extent the political biases of language models are actually picked up from training data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Zweitens wollen wir untersuchen, in welchem Umfang die politischen Vorurteile der Sprachmodelle tatsächlich aus den Trainingsdaten abgeleitet", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_529.wav", "doc_id": "dvGkKzmIaN.seg_529", "src_text": "When a number of triggers in the sentence is greater than m the provided embedding is exactly equal to the target embedding.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wenn die Anzahl der Auslöser in einem Satz größer als „m“ ist, entspricht das gegebene Embedding genau dem Ziel-Embedding.", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_1.wav", "doc_id": "aQpIWggfCo.seg_1", "src_text": "I'm here to introduce our work \"Distilling Script Knowledge from Large Language Models for Constrained Language Planning\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Fudan, und ich möchte unsere Arbeit vorstellen, die sich auf die Unterscheidung von Schriftkenntnissen aus leichten Sprachmodellen für die Sprachplanung bezieht.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_513.wav", "doc_id": "dvGkKzmIaN.seg_513", "src_text": "Second, the watermark should not degrade the utility of the provided embeddings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "zweitens sollte das Wasserzeichen nicht die Nutzbarkeit der bereitgestellten Einbettungen verringern: drittens", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_540.wav", "doc_id": "dvGkKzmIaN.seg_540", "src_text": "We also validate the covertness of the provided embedding by visualising the embedding of sentences on four dataset [INAUDIBLE 4:39] PCA.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir haben auch die Geheimhaltung des bereitgestellten Eingriffs durch die Visualisierung des Eingriffs von Sätzen in „forded z vol pca“ überprüft.", "score": 79.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_842.wav", "doc_id": "GvEBWkLmuI.seg_842", "src_text": "To capture these patterns, our method has two parts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Um diese Muster zu erzeugen, hat unsere Methode zwei Teile:", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_789.wav", "doc_id": "WTTtiRKFZI.seg_789", "src_text": "So \"Marge read it yesterday\" is fine because the direct object is close to the verb, while \"Marge read yesterday it\" is much worse.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "in 'March red yesterday is fine' weil das direkte Objekt sich dem Verb nähert, während 'March red yesterday' viel schlechter ist,", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_778.wav", "doc_id": "WTTtiRKFZI.seg_778", "src_text": "They single out one of the conjuncts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "auch symmetrischen Annäherungen an koordinierte Strukturen,", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_593.wav", "doc_id": "oeooqChmKK.seg_593", "src_text": "But natural language understanding often requires knowledge that is also supplied at inference time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Aber das Verstehen der natürlichen Sprache erfordert oft Wissen, das auch bei der Inferenzzeit geliefert wird.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_731.wav", "doc_id": "XejEJmgUmE.seg_731", "src_text": "This is a joint work with John Gauthier, Aaron Mueller, Kanishka Misra, Karen Fences, Roger Levy, and Adina Williams.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Es handelt sich um eine gemeinsame Arbeit mit John Gauthier, Aaron Muller, Kishka Misra, Karen Fenton, Roger Levy und Adina Williams.", "score": 79.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_10.wav", "doc_id": "aQpIWggfCo.seg_10", "src_text": "In this paper, we first evaluate and improve the constrained language planning ability of large language models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "In diesem Papier bewerten wir zunächst die konstruktive Sprachplanfähigkeit von Sprachmodellen der Lernsprache und verbessern sie. Es", "score": 86.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_89.wav", "doc_id": "TVCREhgqUP.seg_89", "src_text": "That's because this is related to the \"Traveling Salesman\" problem.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "ist, und das, weil sie mit dem Problem des Reisenden Verkäufers verbunden ist.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_615.wav", "doc_id": "oeooqChmKK.seg_615", "src_text": "This last setting is especially interesting, since it simulates the case where the background knowledge necessary to solve a task is not part of the pretrain data of models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "letzte Setting ist besonders interessant. s einen Fall simuliert, bei dem das erforderliche Hintergrundwissen zur Lösung einer Aufgabe erforderlich ist, ist es nicht Teil der vorkonfigurierten Datenmodelle.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_530.wav", "doc_id": "dvGkKzmIaN.seg_530", "src_text": "Copyright verification is to detect whether a model behind another service contains the word mark.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die Urheberrechtsüberprüfung besteht darin, festzustellen, ob ein Modell hinter einem anderen Dienst das Wasserzeichen enthält.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_586.wav", "doc_id": "rISrKoXQCx.seg_586", "src_text": "Ok, great.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "viel oder", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_838.wav", "doc_id": "GvEBWkLmuI.seg_838", "src_text": "So here are some example generations from GPT-4.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hier sind einige Beispiel-Generierungen von GPT-4.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_288.wav", "doc_id": "PIZEXUFLAR.seg_288", "src_text": "In each experiment, we report the min and max performance and the standard deviation of the performance across all 5 experiments.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "in jeder Experimente bewerten. Wir berichten die Mittel- und Maximalleistung. und die Standardabweichung der Leistung in allen fünf Experimenten.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_728.wav", "doc_id": "XejEJmgUmE.seg_728", "src_text": "Hi, everyone.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo alle,", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_121.wav", "doc_id": "uZBWfYjYnf.seg_121", "src_text": "Thanks for your attention.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vielen Dank für Ihre Aufmerksamkeit.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_623.wav", "doc_id": "oeooqChmKK.seg_623", "src_text": "Without task-specific training on KITMUS, both models do not perform well.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "spezifische Training auf dem Kidmus ist bei beiden Modellen jedoch", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_169.wav", "doc_id": "SLpqvupgvW.seg_169", "src_text": "Or the pronunciations are too similar to each other and hard to disambiguate.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Oder die Aussprachen sind zu ähnlich voneinander und schwer zu unterscheiden,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_357.wav", "doc_id": "gGbuDbHhyc.seg_357", "src_text": "Finally, should we only use the clean samples for validation, or there are better ways to utilize them?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "sollten wir nur die sauberen Proben zur Validierung verwenden oder gibt es bessere Möglichkeiten,", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_459.wav", "doc_id": "hgIDlKNiFM.seg_459", "src_text": "Finally, as a conclusion our proper system offered better performance on nine of the 11 downstream tasks and surpassed globally the result of the generic model, here CamemBERT.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Zusammenfassend lässt sich sagen, dass unser vorgeschlagenes System eine bessere Leistung bei neun der elf Aufgaben des generischen Modells liefert. Wir beobachten auch, dass", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_89.wav", "doc_id": "TVCREhgqUP.seg_89", "src_text": "That's because this is related to the \"Traveling Salesman\" problem.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "ist mit dem Problem des reisenden Verkäufers verwandt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_339.wav", "doc_id": "dJGfOSFgZO.seg_339", "src_text": "And we look forward to seeing how conversational AI will advance in the coming months and years.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "uns darauf, zu sehen, wie konversationale AI in den kommenden Monaten und Jahren vorankommt.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_671.wav", "doc_id": "FLkGnzVRew.seg_671", "src_text": "These are the links to our core data set and our paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Diese sind die Links zu unserem Code-Datensatz und unserem Paper.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_169.wav", "doc_id": "SLpqvupgvW.seg_169", "src_text": "Or the pronunciations are too similar to each other and hard to disambiguate.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die Aussprachen sind sich zu ähnlich und schwer zu unterscheiden.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_258.wav", "doc_id": "oYCKgTzTDy.seg_258", "src_text": "We conduct a comprehensive benchmark study on three representative types of multilingual language models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir führen eine umfassende Benchmark-Studie zu drei Arten von mehrsprachigen Modellen durch,", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_37.wav", "doc_id": "aQpIWggfCo.seg_37", "src_text": "This figure shows the constraint distribution of CoScript.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Diese Grafik zeigt die konstruktive Verteilung von Coscript,", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_683.wav", "doc_id": "oaOHnMCwad.seg_683", "src_text": "Design biases like the one that we just saw before might occur due to the positionality of the NLP researchers and model developers.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Designfehler wie der, den wir gerade gesehen haben, könnten aufgrund der Positionierung der NLP-Forscher und Modellentwickler auftreten.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_691.wav", "doc_id": "oaOHnMCwad.seg_691", "src_text": "So to study data set and model positionality, we actually compare the annotations with real users with existing datasets and models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Also, um die Datensatz- und Modellpositioniertheit zu studieren, vergleichen wir die Anmerkungen mit realen Benutzern mit bestehenden Datensätzen und Modellen.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_34.wav", "doc_id": "aQpIWggfCo.seg_34", "src_text": "We appy our method for building a dataset of constrained language planning, named as CoScript.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir planen unsere Methode für die Erstellung eines Datensatzes zur Sprachplanung mit begrenzten Sprachfähigkeiten, genannt CodeScript.", "score": 59.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_828.wav", "doc_id": "GvEBWkLmuI.seg_828", "src_text": "Hi, I'm Myra and today I'll be talking about our paper \"Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hallo, ich bin Maira, und heute werden wir über unsere Papiermarken sprechen, die wir als Personen benutzen, um Stereotypen und Sprachmodelle mit natürlicher Sprache", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_653.wav", "doc_id": "FLkGnzVRew.seg_653", "src_text": "Since the initial model was not able to capture the dissonance class at all, we start the active learning process by transferring weights from closely related tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Das ursprüngliche Modell konnte die Distanzklasse nicht überhaupt aufnehmen, daher starten wir den aktiven Lernprozess durch das Übertragen von Gewichten aus den benannten Aufgaben.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_847.wav", "doc_id": "GvEBWkLmuI.seg_847", "src_text": "The benefit of this is that we get really specific stereotypes and patterns, without having to rely on any specific lexicon.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Der Vorteil davon ist, dass wir wirklich spezifische Stereotypen und Muster erkennen können, ohne uns auf einen bestimmten Lexikonbegriff zu verlassen.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_878.wav", "doc_id": "GvEBWkLmuI.seg_878", "src_text": "Thank you so much for listening.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vielen Dank, dass Sie", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_36.wav", "doc_id": "aQpIWggfCo.seg_36", "src_text": "To ensure the quality of the validation and test set, we ask crowd-sourced workers to find and revise the incorrect samples.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "um die Qualität der Validierung und Testseiten zu gewährleisten. Wir bitten Crowdsourced-Arbeiter, die fehlerhaften Proben zu überarbeiten.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_446.wav", "doc_id": "hgIDlKNiFM.seg_446", "src_text": "To answer this question, we first train and compare four from-scratch models: a first version of DrBERT, with 7 GB of NACHOS; a second version of 4 GB of set of NACHOS; a first version of ChuBERT, which is a clinical model with 4 GB of sentences taken from clinical notes; and a final version of ChuBERT with a mix of 4 GB of set of NACHOS and 4 GB of clinical notes.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Um diese Frage zu beantworten, fahren wir zunächst mit einem Scratch-Modell, einer ersten Version von Doctor Bert mit sieben Gigabyte von Natios, und einer zweiten Version von vier Gigabyte von Natios. Eine erste Version von Shubert, die ein klinisches Modell mit vier Gigabyte Sätze von klinischen Notizen ist, und eine letzte Version von Shubert mit einer Mischung aus vier Gigabyte Sätze von Natur und vier Gigabyte Sätze von klinischen Notizen.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_126.wav", "doc_id": "wLqFAuDnKa.seg_126", "src_text": "At the time of publication, it achieved state-of-the-art in hundreds of NLP tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "780 Milliarden Token umfasst. Die Tamal-Produktion erreicht den Status der Kunst in Hunderten von NRP-Aufgaben.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_287.wav", "doc_id": "PIZEXUFLAR.seg_287", "src_text": "So during test for each task, we conduct a total of 5 experiments by evaluating the model using one of the five instructions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wird. Während der Tests führen wir insgesamt fünf Experimente durch, indem wir das Modell anhand einer der fünf Anweisungen", "score": 79.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_873.wav", "doc_id": "GvEBWkLmuI.seg_873", "src_text": "So based on these patterns, we conclude with three recommendations for model owners.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Daher schließen wir mit drei Empfehlungen für Modellbesitzer", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_514.wav", "doc_id": "dvGkKzmIaN.seg_514", "src_text": "Third, the watermark should be covert enough to the attacker or the attacker can remove the watermark easily.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "sollte das Wasserzeichen genug abgedeckt sein, damit der Angreifer es leicht entfernen kann:", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_11.wav", "doc_id": "aQpIWggfCo.seg_11", "src_text": "Since no dataset of specific goals exists to support our study, we have to acquire these goals first.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "es keine spezifischen Ziele, die wir erreichen wollen. Wir müssen diese Ziele zunächst erwerben,", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_864.wav", "doc_id": "GvEBWkLmuI.seg_864", "src_text": "This contributes to a long legacy of discrimination and othering for these groups.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Dies trägt zu einer langen Geschichte der Diskriminierung und anderen für diese Gruppen bei.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_567.wav", "doc_id": "rISrKoXQCx.seg_567", "src_text": "We separately pretrain language models on the two different temporal corpora.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und die des fünfundvierzigsten Präsidenten der Vereinigten Staaten.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_98.wav", "doc_id": "uZBWfYjYnf.seg_98", "src_text": "And training and maintaining several models to reach different latency regimes.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "beinhalten. Das Training und die Erhaltung mehrerer Modelle, um verschiedene Latenzzeitregime zu erreichen,", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_593.wav", "doc_id": "oeooqChmKK.seg_593", "src_text": "But natural language understanding often requires knowledge that is also supplied at inference time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Verständigung von natürlicher Sprache erfordert oft Wissen, das auch zur Zeit der Schlussfolgerung bereitgestellt wird.", "score": 59.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_427.wav", "doc_id": "WBLMIsdIrq.seg_427", "src_text": "We also compared different commercial systems and our benchmark shows that DeepL is usually more accurate than Google Translate for document-level translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir vergleichen außerdem verschiedene kommerzielle Systeme und unser Benchmark zeigt, dass Deep-RL in der Regel genauer ist als Google Translate", "score": 62.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_72.wav", "doc_id": "TVCREhgqUP.seg_72", "src_text": "We introduce a new method to predict the permutation that does not put any hard constraints on the possible permutations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir führen eine neue Methode zur Vorhersage einer Permutation ein, die keine harten Einschränkungen auf die möglichen Permutationen", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_391.wav", "doc_id": "WBLMIsdIrq.seg_391", "src_text": "However, evaluating how well models can translate cases like this is pretty hard.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "daher ändert sich auch die Übersetzung. Die Bewertung, wie gut Modelle Fälle wie diese", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_11.wav", "doc_id": "aQpIWggfCo.seg_11", "src_text": "Since no dataset of specific goals exists to support our study, we have to acquire these goals first.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Da keine Datenbank mit spezifischen Zielen existiert, die unsere Studie unterstützen, müssen wir diese Ziele zuerst erwerben.", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_602.wav", "doc_id": "oeooqChmKK.seg_602", "src_text": "Kea is a Baker.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Richter, Kiah ist eine", "score": 25.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_81.wav", "doc_id": "TVCREhgqUP.seg_81", "src_text": "Our model outperforms the others by a large margin on generalization to deeper recursion.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "unser Modell übertrifft die anderen mit einer großen Marke bei der Verallgemeinerung zu einer tiefen Rekursion. Manche", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_582.wav", "doc_id": "rISrKoXQCx.seg_582", "src_text": "So if we do not sanitize political opinions in language model training data, the bias would propagate from pretraining data to language models to downstream tasks, ultimately creating fairness issues.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wenn wir also die politischen Meinungen in der Sprachmodell-Trainingsdaten nicht standardisieren, werden die Voreingenommenheiten von der Vortrainingsdaten zu Sprachmodellen zu Downstream-Aufgaben verbreitet, was letztendlich Fairness-Probleme schafft.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_347.wav", "doc_id": "gGbuDbHhyc.seg_347", "src_text": "When compared to human annotations, the weaker annotations are much cheaper, yet they are also noisy, meaning that a certain amount of the annotations are incorrect.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Im Vergleich zu menschlichen Anmerkungen sind schwache Anmerkungen viel billiger, aber sie sind auch lautstark, was bedeutet, dass ein gewisser Teil der Anmerkungen falsch ist.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_371.wav", "doc_id": "gGbuDbHhyc.seg_371", "src_text": "So in practice, there's no reason to choose more complex WSL methods which require more computation time and disk space.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In der Praxis gibt es also keinen Grund, komplexere WS-L-Methoden zu wählen, die mehr Rechenzeit und Speicherplatz erfordern.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_800.wav", "doc_id": "WTTtiRKFZI.seg_800", "src_text": "So these two trees only show the length of the crucial dependencies, the ones that are not constant among these two structures.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Von den kritischen Abhängigkeiten also diejenigen, die nicht konstant sind zwischen diesen beiden Strukturen, also hier", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_578.wav", "doc_id": "rISrKoXQCx.seg_578", "src_text": "So this has sound the alarm for us to acknowledge and tackle the fairness issues resulting by language model political leanings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Kontrolle ausbreiten. So klingt die Warnung, die wir Ihnen geben müssen, damit Sie sich über die Fairness-Aspekte, die sich aus Sprachmodellen", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_185.wav", "doc_id": "SLpqvupgvW.seg_185", "src_text": "We always use a simple template.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir verwenden immer ein einfaches Template:", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_220.wav", "doc_id": "oYCKgTzTDy.seg_220", "src_text": "Existing cross-lingual semantic parsing models are separately proposed and evaluated on data set of limited tasks and applications.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Bestehende Modelle für die Analyse mehrsprachiger Texte werden beispielsweise separat vorgeschlagen und bewertet. Es", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_359.wav", "doc_id": "gGbuDbHhyc.seg_359", "src_text": "First, we find that, interestingly, recent WSL methods indeed require clean validation samples to work properly.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Erstens stellen wir fest, dass interessante neuere WS-L-Methode tatsächlich saubere Validierungsmuster erfordert, um richtig zu funktionieren.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_49.wav", "doc_id": "TVCREhgqUP.seg_49", "src_text": "This is joint work with my advisors Alexander Koller and Ivan Titov.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Dies ist eine gemeinsame Arbeit mit meinen Beratern, Alexander Kolla und Ivan Tito.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_695.wav", "doc_id": "oaOHnMCwad.seg_695", "src_text": "And we ought to do this over looking at the demographics of original data sets annotators, because, usually only a few annotators annotate each instance and because demographics are rarely collected and shared.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir können dies überprüfen, indem wir uns die Demographien der ursprünglichen Datensätze anschauen, denn normalerweise werden nur wenige Annotatoren jede Instanz annotiert, und weil Demographien wirklich gesammelt und geteilt", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_378.wav", "doc_id": "gGbuDbHhyc.seg_378", "src_text": "Third, continuous fine-tuning is a simple yet strong baseline that should be considered in future work in WSL.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Drittens ist die kontinuierliche Feinabstimmung eine einfache, aber starke Basislinie, die in zukünftiger Arbeit in Wsl berücksichtigt werden sollte.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_8.wav", "doc_id": "aQpIWggfCo.seg_8", "src_text": "An abstract goal can be inherited by different real-life specific goals with multi-faceted constraints.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "abstraktes Ziel kann von verschiedenen realen Lebensziele mit mehrfachdimensionalen Einschränkungen geerbt werden;", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_576.wav", "doc_id": "rISrKoXQCx.seg_576", "src_text": "There are a bunch of more examples in the appendix to further highlight that this indicates that there is a fairness issue that is very pressing regarding the political biases of language models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "sozialen Kategorien geben. Es gibt noch mehr Beispiele in den Anhängen. Dies deutet darauf hin, dass es eine Fairness-Problematik gibt, die sehr drängend ist, was die politischen Vorurteile von Sprachmodellen betrifft.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_134.wav", "doc_id": "wLqFAuDnKa.seg_134", "src_text": "The majority of sentences 516 out of 1,000.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die Mehrheit der Sätze, fünfundsechzehn aus Tausend, unterscheidet", "score": 5.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_175.wav", "doc_id": "SLpqvupgvW.seg_175", "src_text": "Our data set collection methodology emphasizes informality using a cartoon completion setup.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Unsere Datensatz-Sammlungsmethode betont die Informalität, indem sie einen Cartoon-Completion-Set verwendet.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_150.wav", "doc_id": "wLqFAuDnKa.seg_150", "src_text": "But, PaLM comes pretty close to a commercial system.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Palm-Übersetzungen, aber Palm kommt uns in unserem Fall ziemlich", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_345.wav", "doc_id": "gGbuDbHhyc.seg_345", "src_text": "In weak supervision, you do not manually label the data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In der Woche-Überwachung werden die Daten nicht manuell markiert, sondern mit", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_160.wav", "doc_id": "SLpqvupgvW.seg_160", "src_text": "I'm going to talk about our work on \"Resolving Indirect Referring Expressions for Entity Selection\", in which we introduce the AltEntities Corpus.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "ich werde über unsere Arbeit zur Lösung von indirekten Referenzausdrücken für die Entitätenauswahl sprechen, in der wir den Altentitäten-Korpus einführen.", "score": 92.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_200.wav", "doc_id": "SLpqvupgvW.seg_200", "src_text": "For recipes, we additionally show their images, again from Wikipedia, so that the annotators know how they look like.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Für Rezepte zeigen wir zusätzlich Bilder aus Wikipedia, damit die Anmerker wissen, wie sie aussehen.", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_738.wav", "doc_id": "XejEJmgUmE.seg_738", "src_text": "These days large language models are coming up with longer and longer context windows.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Diese großen Sprachmodelle kommen mit längeren", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_73.wav", "doc_id": "TVCREhgqUP.seg_73", "src_text": "This makes our approach quite flexible and expressive.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "aufbürdet. Dies macht unsere Herangehensweise sehr flexibel und ausdrucksstark.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_493.wav", "doc_id": "SUkmfOTvGi.seg_493", "src_text": "And these goes hand in hand, we can't just have one ingredient but throw out the others.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Diese Ziele sind miteinander verbunden: Wir können nicht nur ein Ingredienz haben, sondern müssen die anderen durchlaufen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_708.wav", "doc_id": "oaOHnMCwad.seg_708", "src_text": "We find that there is positionality in NLP.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir", "score": 6.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_305.wav", "doc_id": "PIZEXUFLAR.seg_305", "src_text": "So one more thing, we are collecting a much larger multi-model instruction tuning dataset with around 150 additional vision language tasks and we will release them.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "noch etwas: Wir sammeln ein viel größeres multimodales Anweisungstuning-Datensatz mit etwa 150 zusätzlichen Vision- und Sprachaufgaben und werden sie veröffentlichen.", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_863.wav", "doc_id": "GvEBWkLmuI.seg_863", "src_text": "And these words define these groups only by their relationship to their identity and distinguish them as different from the white norm.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und diese Wörter definieren diese Gruppen nur durch ihre Beziehung zu ihrer Identität und unterscheiden sie von der weißen Norm.", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_201.wav", "doc_id": "SLpqvupgvW.seg_201", "src_text": "Then, we asked the annotators to pick one of these entities, for example, here's the first one, and describe them using three to five indirect referring expressions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dann bitten wir die Herausgeber, eine dieser Einheiten auszuwählen, zum Beispiel die erste, und sie mit drei bis fünf indirekten Verweisbegriffen zu beschreiben.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_229.wav", "doc_id": "oYCKgTzTDy.seg_229", "src_text": "The first one is Translate-Test.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "erste ist der „Translation", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_437.wav", "doc_id": "hgIDlKNiFM.seg_437", "src_text": "And finally, we conclude about the experiments and give you more details about how to access those models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "fassen wir die Experimente zusammen und geben Ihnen mehr Details dazu, wie man auf die Modelle zugreifen kann.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_537.wav", "doc_id": "dvGkKzmIaN.seg_537", "src_text": "We conduct experiments on four data sets AG News, MIND, SST2 and Enron Spam.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir führten Experimente auf vier Datensätzen durch: Age of News, Mind, SSD2 und Eris Spam.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_653.wav", "doc_id": "FLkGnzVRew.seg_653", "src_text": "Since the initial model was not able to capture the dissonance class at all, we start the active learning process by transferring weights from closely related tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Da das ursprüngliche Modell nicht in der Lage war, die Klassendifferenz zu erfassen, starten wir den aktiven Lernprozess, indem wir Gewichte von eng verwandten Aufgaben übertragen.", "score": 61.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_592.wav", "doc_id": "oeooqChmKK.seg_592", "src_text": "Recent works in tasks like question answering show that models can use pretrained-time knowledge to solve the task.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Infersion gegeben wird. Jüngste Arbeiten in der Aufgabenbereich wie Fragebeantwortung zeigen, dass Modelle vorbereitete Zeitwissen nutzen können, um die Aufgabe zu lösen.", "score": 22.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_163.wav", "doc_id": "SLpqvupgvW.seg_163", "src_text": "Consider this alternative question.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "will. Überlegen Sie sich diese alternative", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_75.wav", "doc_id": "TVCREhgqUP.seg_75", "src_text": "We go from left to right over the output and determine which multiset token to put in every position.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir gehen von links nach rechts über den Ausgang und bestimmen, welcher Multisets-Token in jede Position gesetzt", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_823.wav", "doc_id": "WTTtiRKFZI.seg_823", "src_text": "But when the governor is on the right this tendency disappears.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "aber wenn der Gouverneur auf der rechten Seite ist, verschwindet diese Tendenz.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_878.wav", "doc_id": "GvEBWkLmuI.seg_878", "src_text": "Thank you so much for listening.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Vielen Dank fürs Zuhören.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_552.wav", "doc_id": "rISrKoXQCx.seg_552", "src_text": "So on one hand, they were able to learn from diverse perspectives, which celebrates democracy and the plurality of ideas.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "der einen Seite aus unterschiedlichen Perspektiven gelernt werden, die die Demokratie und die Pluralität von Ideen feiern,", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_847.wav", "doc_id": "GvEBWkLmuI.seg_847", "src_text": "The benefit of this is that we get really specific stereotypes and patterns, without having to rely on any specific lexicon.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Der Vorteil davon ist, dass wir sehr spezifische Stereotypen und Muster ohne die Abhängigkeit von einem bestimmten Lexikon erhalten.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_500.wav", "doc_id": "dvGkKzmIaN.seg_500", "src_text": "Hello everyone, my name is Jingwei Yi from the University of Science and Technology of China.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo alle, mein Name ist Jin Wei von der Universität für Wissenschaft und Technologie in China.", "score": 72.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_706.wav", "doc_id": "oaOHnMCwad.seg_706", "src_text": "Our study in the end amassed over 16,000 annotations from over 1000 annotators from 87 countries.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Weise wurden studiert und am Ende wurden über 16.000 Anmerkungen von über 1.000 Anmerkern aus 87", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_265.wav", "doc_id": "PIZEXUFLAR.seg_265", "src_text": "Recently, many studies have shown that instruction tuning enables large language models to perform on unseen tasks in a zero-shot manner by following natural instructions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "haben kürzlich gezeigt, dass die Anweisungstuning große Sprachmodelle in der Lage sind, unsichtbare Aufgaben in vollem Umfang auszuführen, indem sie natürliche Anweisungen befolgen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_552.wav", "doc_id": "rISrKoXQCx.seg_552", "src_text": "So on one hand, they were able to learn from diverse perspectives, which celebrates democracy and the plurality of ideas.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Auf der einen Seite sind dies unterschiedliche politische Meinungen, die die Demokratie und die Meinungsfreiheit feiern, und auf", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_863.wav", "doc_id": "GvEBWkLmuI.seg_863", "src_text": "And these words define these groups only by their relationship to their identity and distinguish them as different from the white norm.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Diese Wörter definieren diese Gruppen nur durch ihre Beziehung zu ihrer Identität und unterscheiden sie von der weißen Norm.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_343.wav", "doc_id": "gGbuDbHhyc.seg_343", "src_text": "This is joint work with Xiaoyu Shen, Marius Mosbach, Andreas Stephan, and Dietrich Klakow.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Dies ist eine gemeinsame Arbeit mit Shaul Usishkin, Mario Smusbach, Gias Stefan und Detrich Klarov.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_698.wav", "doc_id": "oaOHnMCwad.seg_698", "src_text": "Our frame is largely enabled through Lab in the Wild and online crowdsourcing platform for where HCI collaborator.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Unsere Rahmenbedingungen werden größtenteils durch Lab in the Wild ermöglicht, ein Online-Plattform für Crowdsourcing, die", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_731.wav", "doc_id": "XejEJmgUmE.seg_731", "src_text": "This is a joint work with John Gauthier, Aaron Mueller, Kanishka Misra, Karen Fences, Roger Levy, and Adina Williams.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "ist eine gemeinsame Arbeit mit John Gaultier, Aaron Muller, Kanshka Mira, Karen Fuentes, Roger Leavy und Adina Williams.", "score": 88.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_2.wav", "doc_id": "aQpIWggfCo.seg_2", "src_text": "In everyday life, humans often plan their actions by following step-by-step instructions in the form of goal-oriented scripts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Im täglichen Leben planen Menschen ihre Handlungen oft, indem sie Schritt für Schritt Anweisungen in der Form von Zielskripten befolgen. In", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_842.wav", "doc_id": "GvEBWkLmuI.seg_842", "src_text": "To capture these patterns, our method has two parts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Um diese Muster zu erfassen, hat unsere Methode zwei Teile:", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_400.wav", "doc_id": "WBLMIsdIrq.seg_400", "src_text": "In this work, we extend CXMI to Pointwise CXMI which can measure context usage at the sentence level or at the word level.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "In dieser Arbeit erweitern wir XMI bis zum Punkt XMI, was den Kontext auf der Satz- oder Wortebene messen kann.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_720.wav", "doc_id": "oaOHnMCwad.seg_720", "src_text": "And the other is to do NLP research with the lens of perspectivism.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "und die anderen sollen die NLP-Forschung mit dem Lernen von Perspektivismus durchführen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_459.wav", "doc_id": "hgIDlKNiFM.seg_459", "src_text": "Finally, as a conclusion our proper system offered better performance on nine of the 11 downstream tasks and surpassed globally the result of the generic model, here CamemBERT.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zusammenfassend lässt sich sagen, dass unser System bei neun von elf Donut-Aufgaben die Ergebnisse des generischen Modells weltweit übertraf. Außerdem", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_265.wav", "doc_id": "PIZEXUFLAR.seg_265", "src_text": "Recently, many studies have shown that instruction tuning enables large language models to perform on unseen tasks in a zero-shot manner by following natural instructions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "erforschen. Vor kurzem haben viele Studien gezeigt, dass die Einstellung von Anweisungen große Sprachmodelle ermöglicht, unerkannte Aufgaben in einer gründlichen Art und Weise auszuführen, indem sie natürlichen Anweisungen folgen.", "score": 83.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_486.wav", "doc_id": "SUkmfOTvGi.seg_486", "src_text": "The second hypothesis is temporal drift which is the performance degradation that is caused by the increasing temporal gap between the train and the test data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die zweite Hypothese ist die zeitliche Drift, also die Leistungsabnahme, die durch die zunehmende zeitliche Lücke zwischen Zug und Testdaten verursacht wird.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_610.wav", "doc_id": "oeooqChmKK.seg_610", "src_text": "We vary the availability of these two pieces of information such that it may either be found in a single source, or in multiple sources.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir variieren die Verfügbarkeit dieser beiden Teile der Informationen, sodass sie entweder in einer Quelle oder in mehreren Quellen gefunden werden können.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_635.wav", "doc_id": "FLkGnzVRew.seg_635", "src_text": "Simply put, cognitive dissonance is two beliefs or actions that are inconsistent, such as this example where a person states, \"I know that cigarettes could kill me\", and then goes on to say \"I grabbed a couple of smokes after the meeting\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "studieren: einfach, kognitive Diskrepanzen sind zwei Glaubensansichten oder Handlungen, die inkonsistent sind. So wie dieses Beispiel: Wenn eine Person sagt, ich weiß, dass Zigaretten mich töten könnten, und dann sagt, ich habe nach dem Treffen ein paar Zigaretten geraucht,", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_693.wav", "doc_id": "oaOHnMCwad.seg_693", "src_text": "Our framework works in two main steps.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Unser Framework arbeitet in zwei Hauptschritten:", "score": 83.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_462.wav", "doc_id": "hgIDlKNiFM.seg_462", "src_text": "So thank you for this presentation, and we are looking forward to exchange at the poster session in Toronto.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "danken für diese Präsentation und freuen uns auf die Diskussionen in Toronto.", "score": 81.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_674.wav", "doc_id": "oaOHnMCwad.seg_674", "src_text": "Hi everyone.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hallo,", "score": 83.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_726.wav", "doc_id": "oaOHnMCwad.seg_726", "src_text": "But if you'd like to learn more, feel free to check out our dashboard for the most updated analysis results and our paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "wenn Sie mehr lernen möchten, können Sie unsere Tabelle für die meisten aktualisierten Analysen und unsere Papiere kostenlos überprüfen.", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_458.wav", "doc_id": "hgIDlKNiFM.seg_458", "src_text": "Which is not the case for the model based on CamemBERT weights and tokenizer, which suffer from stability issues.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "4GB from scratch, which is not the case for the model based on Camembert weights and tokenizer which suffer from", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_23.wav", "doc_id": "aQpIWggfCo.seg_23", "src_text": "Then, InstructGPT over-generates K scripts for specific goals.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Dann generiert der instruierte GPT-Kasus für bestimmte Ziele", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_58.wav", "doc_id": "TVCREhgqUP.seg_58", "src_text": "Naive seq2seq models struggle with this kind of out-of-distribution generalization and often produce outputs that are detached from the input.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Sequenz-zu-Sequenz-Modelle kämpfen mit dieser Art der Output-Generalisierung und produzieren oft Outputs, die vom Input abweichen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_297.wav", "doc_id": "PIZEXUFLAR.seg_297", "src_text": "So we also did one experiment.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "eine geringere Sensibilität. Wir haben auch ein Experiment durchgeführt:", "score": 42.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_125.wav", "doc_id": "wLqFAuDnKa.seg_125", "src_text": "It's trained on a large collection of text, comprising 780 billion tokens.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Es wurde auf einer großen Sammlung von Texten trainiert, die 780 Milliarden Token umfasst.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_224.wav", "doc_id": "oYCKgTzTDy.seg_224", "src_text": "For example, there's only one single model to evaluate them.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "zum Beispiel gibt es nur ein einziges Modell, um sie zu bewerten.", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_708.wav", "doc_id": "oaOHnMCwad.seg_708", "src_text": "We find that there is positionality in NLP.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "fest, dass sie positioniert sind in NLP. Zum Beispiel", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_430.wav", "doc_id": "WBLMIsdIrq.seg_430", "src_text": "See you in Toronto.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "bemüht haben, uns zu helfen.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_758.wav", "doc_id": "XejEJmgUmE.seg_758", "src_text": "So here we are choosing or creating sentences from acceptable and unacceptable domains from the same BLiMP or SyntaxGym dataset.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hier wählen oder erstellen wir Sätze aus akzeptablen und unakzeptablen Domänen aus dem gleichen Datensatz.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_265.wav", "doc_id": "PIZEXUFLAR.seg_265", "src_text": "Recently, many studies have shown that instruction tuning enables large language models to perform on unseen tasks in a zero-shot manner by following natural instructions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In letzter Zeit haben viele Studien gezeigt, dass die Anpassung an Anweisungen es großen Sprachmodellen ermöglicht, in einer durchschnittlichen Weise auf unerforschte Aufgaben zu arbeiten, indem sie natürliche Anweisungen befolgen.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_745.wav", "doc_id": "XejEJmgUmE.seg_745", "src_text": "We extract grammatical sentences from Adjunct Island and then we add it as a prefix to both the acceptable query and the unacceptable query.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "sind grammatikalische Sätze, die wir aus dem Adjektiv extrahieren. Und dann fügen wir es als Präfix zu sowohl der akzeptablen als auch der nicht akzeptablen Frage hinzu.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_541.wav", "doc_id": "dvGkKzmIaN.seg_541", "src_text": "The legend of the figures means the number of triggers in each sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Die Legende der Abbildung bedeutet die Anzahl der Auslöser in jedem Satz.", "score": 88.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_642.wav", "doc_id": "FLkGnzVRew.seg_642", "src_text": "High cognitive dissonance is also related to anxiety disorders and can help understand people's mental health better.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hohe kognitive Diskrepanz ist auch mit Angststörungen verbunden und kann helfen, das mentale Wohlbefinden der Menschen besser zu verstehen.", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_529.wav", "doc_id": "dvGkKzmIaN.seg_529", "src_text": "When a number of triggers in the sentence is greater than m the provided embedding is exactly equal to the target embedding.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wenn die Anzahl der Trigger im Satz größer als m ist, ist das bereitgestellte Embedding genau gleich dem", "score": 64.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_128.wav", "doc_id": "wLqFAuDnKa.seg_128", "src_text": "We evaluated the transition capability of such models using the best practices of the MT community.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir bewerten die Übersetzungsqualität solcher Modelle, indem wir die besten Praktiken der MMT-Gemeinschaft verwenden,", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_132.wav", "doc_id": "wLqFAuDnKa.seg_132", "src_text": "Finally, we provide some recommendations for prompt selection strategies.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "geben wir Empfehlungen für Prompt-Selektionsstrategien. Die Vorhaltung", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_100.wav", "doc_id": "uZBWfYjYnf.seg_100", "src_text": "So what is our solution?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "ist unsere Lösung?", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_542.wav", "doc_id": "dvGkKzmIaN.seg_542", "src_text": "As shown in the figures, it's hard to distinguish between, the backdoor embeddings and normal embeddings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wie die Abbildungen zeigen, ist es schwer, zwischen dekodierten und normalen Einbettungen zu unterscheiden.", "score": 86.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_6.wav", "doc_id": "aQpIWggfCo.seg_6", "src_text": "Planning for the goals with specific constraints, such as \"make a chocolate cake\", still remains under-studied.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Die Planung für die Ziele mit spezifischen Einschränkungen, wie z. B. „Schokoladenkuchen backen“, ist noch ungeklärt.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_777.wav", "doc_id": "WTTtiRKFZI.seg_777", "src_text": "Right.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Ansätze symmetrisch, richtig?", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_313.wav", "doc_id": "dJGfOSFgZO.seg_313", "src_text": "The common practice is to use human evaluation, such as by asking human judges to select which of two conversations is better or to rate conversations given a Likert scale.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "ist. Die übliche Praxis ist die Verwendung einer menschlichen Bewertung, beispielsweise durch die Auswahl von zwei Gesprächen, die besser sind, oder die Bewertung von Gesprächen in einer Likert-Skala.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_825.wav", "doc_id": "WTTtiRKFZI.seg_825", "src_text": "So see the paper for the full arguments.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "das Papier für die vollständige Vereinbarung und die", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_591.wav", "doc_id": "oeooqChmKK.seg_591", "src_text": "Natural language understanding models draw on a variety of knowledge sources, such as knowledge contained in their parameters, usually acquired by a pretraining, and knowledge given in inputs at inference time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Modelle für das Verständnis der nationalen Sprache, die sich auf eine Vielzahl von Wissensquellen beziehen, wie z. B. Wissen, das in den Parametern enthalten ist, das üblicherweise durch Vorkenntnisse erworben wird und Wissen, das in Eingabefeldern eingegeben wird. Neueste", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_610.wav", "doc_id": "oeooqChmKK.seg_610", "src_text": "We vary the availability of these two pieces of information such that it may either be found in a single source, or in multiple sources.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir variieren die Verfügbarkeit dieser beiden Informationsstücke, sodass sie entweder in einer einzigen Quelle oder in mehreren Quellen gefunden werden können.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_24.wav", "doc_id": "aQpIWggfCo.seg_24", "src_text": "Next, a filter model is developed to select the faithful scripts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Kaskaden. Als nächstes wird ein Filtermodell entwickelt, um die visuellen Skripte auszuwählen.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_384.wav", "doc_id": "WBLMIsdIrq.seg_384", "src_text": "A Data-driven, Multilingual Exploration\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Datengetriebene Mehrsprachenauswertung,", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_728.wav", "doc_id": "XejEJmgUmE.seg_728", "src_text": "Hi, everyone.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Hallo alle,", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_554.wav", "doc_id": "rISrKoXQCx.seg_554", "src_text": "To this end, we propose to investigate the political bias propagation pipeline from pretraining data to language models to downstream tasks, specifically by asking the following questions: First, how do we evaluate the political leaning of language models and what role does pretraining data might have on such political biases?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "schlagen wir vor, die politische Propagandapipe zu untersuchen, insbesondere durch die folgenden Fragen. Erstens, wie bewerten wir die politische Ausrichtung von Sprachmodellen und welche Rolle haben diese bei solchen politischen Vorurteilen? Zweitens, wie", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_407.wav", "doc_id": "WBLMIsdIrq.seg_407", "src_text": "And this can be explained because English doesn't have dual pronouns, so you need context to determine if a pronoun is dual when translating into Arabic.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "dies kann erklärt werden, weil das Englische keine Pronomen hat, so dass man den Begriff erläutern muss, wenn das Pronomen in Arabisch übertragen wird. Und ähnlich finden wir,", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_614.wav", "doc_id": "oeooqChmKK.seg_614", "src_text": "Lastly, the \"Background-Inference\" setting, where both knowledge types are available only at inference time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Schließlich der Hintergrund-Trainingssetting. Die oben genannten Kenntnisse sind nur bei der Inferenzzeit verfügbar. Dieses", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_533.wav", "doc_id": "dvGkKzmIaN.seg_533", "src_text": "Then the provider requests the embeddings from the stealer's service with the data set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Dann fordert der Anbieter Einfügungen aus dem Steuerdienst mit dem Datensatz an.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_560.wav", "doc_id": "rISrKoXQCx.seg_560", "src_text": "We can also see that GPT-4 is the most liberal language model of them all, and GPT series are generally more socially liberal than BART series and its variants.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir können auch sehen, dass GPT4 die liberalest Sprachmodell aller ist, und GPT-Theorien sind im Allgemeinen sozialer liberal als Bert-Theorien und ihre Varianten.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_769.wav", "doc_id": "XejEJmgUmE.seg_769", "src_text": "Please read our paper for more details of our experiments.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "vollständig erfassen. Bitte lesen Sie unser Papier, um mehr Details über unsere Experimente zu erfahren.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_799.wav", "doc_id": "WTTtiRKFZI.seg_799", "src_text": "So the reasoning here is that this is possible because even though this sentence violates the general grammatical principle that direct objects should be next to the verb, it satisfies the principle of dependency length minimization, which says that shorter dependencies are preferred.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "gelesen hat. Das Argument hier ist also, dass dies möglich ist, weil selbst dieser Satz das allgemeine grammatikalische Prinzip verletzt, dass direkte Objekte dem Verb folgen sollen. Es befriedigt das Prinzip der Abhängigkeitsminimierung, das besagt, dass kürzere Abhängigkeiten bevorzugt werden.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_256.wav", "doc_id": "oYCKgTzTDy.seg_256", "src_text": "Pretraining on English natural language can significantly boost the performance of Few-shot on target natural languages, and we found multilingual language models such as Codex and BLOOM are still inadequate for cross-lingual semantic parsing tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "erzielt, dass das Training auf der natürlichen Sprache Englisch die Leistung von Future-Shots auf Ziel-Sprachen erheblich verbessern kann. Und wir stellen fest, dass mehrsprachige Sprachmodelle wie Kodas und Blau immer noch nicht ausreichend sind, um Kreuzsprachen viele Personen zu ermöglichen.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_263.wav", "doc_id": "PIZEXUFLAR.seg_263", "src_text": "Hello everyone, my name is Ying and my colleague Zhiyang and I will be presenting our research on MultiInstruct improving Multi-Modal Zero-Shot Learning via Instruction Tuning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hallo alle, mein Name ist Ying und meine Kollegen und ich werden unsere Forschung zum Multimodalen Soziallernen präsentieren, das durch das Anpassen von Anweisungen verbessert wird.", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_71.wav", "doc_id": "TVCREhgqUP.seg_71", "src_text": "That's why in the second step we use another model to predict a permutation to put them into the right order.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Deshalb verwenden wir im zweiten Schritt ein anderes Modell, um eine Permutation vorherzusagen, um sie in die richtige Reihenfolge zu bringen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_180.wav", "doc_id": "SLpqvupgvW.seg_180", "src_text": "Which is the alternative question.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Gefühl?“ In der zweiten Frage ist die alternative", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_319.wav", "doc_id": "dJGfOSFgZO.seg_319", "src_text": "We call this approach annotating behaviors in chat or ABC-Eval in short.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "widersprechen. Dieser Ansatz wurde in Kurzform als „Annotating Behaviours in Chat“ oder „A.B.C.“ bezeichnet.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_805.wav", "doc_id": "WTTtiRKFZI.seg_805", "src_text": "Right?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Ordnung, aber", "score": 5.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_727.wav", "doc_id": "oaOHnMCwad.seg_727", "src_text": "Thank you.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vielen Dank.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_638.wav", "doc_id": "FLkGnzVRew.seg_638", "src_text": "And they have a consonance relationship.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "sie eine Konsensbeziehung haben. „Dissens“ ist", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_342.wav", "doc_id": "gGbuDbHhyc.seg_342", "src_text": "In this video, I would like to present our recent work \"Weaker Than You Think: A Critical Look at Weakly Supervised Learning.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "In diesem Video möchte ich unsere jüngste Arbeit vorstellen, ein kritischer Blick auf das wöchentliche Lernen.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_637.wav", "doc_id": "FLkGnzVRew.seg_637", "src_text": "Further mentioning that \"I don't think I could keep my job without them\" justifies the second occurrence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "dass ich nicht glaube, dass ich meinen Job über ihnen behalten kann, stellt die zweite Aussage", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_795.wav", "doc_id": "WTTtiRKFZI.seg_795", "src_text": "So both these sentences are fine.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "so dass beide Sätze in Ordnung", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_655.wav", "doc_id": "FLkGnzVRew.seg_655", "src_text": "We find that on transferring the zero-shot performance on the annotated data set is already much better than chance with the best, with AUC .62.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir stellen fest, dass die Übertragung der Nullkurzleistung auf dem annotierten Datensatz bereits viel besser ist als die Chance mit der besten AU", "score": 45.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_196.wav", "doc_id": "SLpqvupgvW.seg_196", "src_text": "So what we do is that we show some background knowledge about the two entities.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Was wir tun, ist, dass wir Hintergrundwissen über die Zwanziger Jahre", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_475.wav", "doc_id": "SUkmfOTvGi.seg_475", "src_text": "And last but not least, we calculated the percentage change in F1 to assess the generalization of each model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "aber nicht zuletzt, haben wir den Prozentsatz der Änderung in F1 berechnet, um die Generalisierung jedes Modells zu bewerten.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_204.wav", "doc_id": "SLpqvupgvW.seg_204", "src_text": "For example, \"the one without words\", \"not the one with the 12 year old boy\", or \"the fictional one\", or \"comes from Azerbaijan\", and so on.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Zum Beispiel der ohne Worte, nicht der mit dem zwölfjährigen Jungen oder der fiktive. Das Korpus der Identitäten hat sechstausend alternative", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_601.wav", "doc_id": "oeooqChmKK.seg_601", "src_text": "Servin is a judge.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Serwin ist ein", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_70.wav", "doc_id": "TVCREhgqUP.seg_70", "src_text": "After the first step, we have all the right tokens, but they're not ordered.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Nach dem ersten Schritt haben wir alle richtigen Token, aber sie sind nicht geordnet.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_422.wav", "doc_id": "WBLMIsdIrq.seg_422", "src_text": "And if we use word f-measure, then models with and without context have comparable performance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "und wenn wir das Wort F-Maß verwenden, haben Modelle mit und ohne Kontext eine vergleichbare Leistung.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_155.wav", "doc_id": "wLqFAuDnKa.seg_155", "src_text": "However, the \"Style/Awkward\" category for PaLM is lower than for the state-of-the-art systems, which is an additional signal that PaLM provides really fluent output, but still with some problems of accuracy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die Style-Autwear-Kategorie für Pan ist jedoch niedriger als für die State-of-the-Art-Systeme, was ein zusätzliches Signal ist. Dass Parm liefert wirklich fließende Ausgaben, aber immer noch mit einigen Problemen der Genauigkeit.", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_25.wav", "doc_id": "aQpIWggfCo.seg_25", "src_text": "We convert scripts and goals into InstructGPT embeddings and calculate the cosine similarity as similarity scores to measure semantic similarity.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir wandeln Skripte und Ziele in instruktive GPT-Embeddings um und berechnen Kosinussimilarität und Ähnlichkeitswerte, um semantische Ähnlichkeit zu messen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_689.wav", "doc_id": "oaOHnMCwad.seg_689", "src_text": "So prior work has suggested some anecdotal evidence of having positionality, such as cultural gaps and models and data sets, as well as theoretical definitions of model positionality.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "So sind einige anekdotische Hinweise auf Positionen wie „Kulturelle Lücken und Modelle“ oder „Reale Definitionen von Positionen“.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_191.wav", "doc_id": "SLpqvupgvW.seg_191", "src_text": "The second one is when the entities have similar titles, for example, two books with the name \"The Return\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die zweite ist, wenn die Einheiten ähnliche Titel haben, zum Beispiel zwei Bücher mit dem Namen „The Reader“.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_263.wav", "doc_id": "PIZEXUFLAR.seg_263", "src_text": "Hello everyone, my name is Ying and my colleague Zhiyang and I will be presenting our research on MultiInstruct improving Multi-Modal Zero-Shot Learning via Instruction Tuning.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hallo alle, mein Name ist Ying und ich werde zusammen mit meinem Kollegen Jiayang unsere Forschung über Multi-Instruction präsentieren: Verbesserung des multimodalen sexuellen Lernens durch Anpassung der Anpassung", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_379.wav", "doc_id": "gGbuDbHhyc.seg_379", "src_text": "Finally, we have open-sourced our code.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Endlich haben wir unseren Code unter Open-Source, den", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_204.wav", "doc_id": "SLpqvupgvW.seg_204", "src_text": "For example, \"the one without words\", \"not the one with the 12 year old boy\", or \"the fictional one\", or \"comes from Azerbaijan\", and so on.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "z. B. die ohne Worte, nicht die mit dem 12-jährigen Jungen oder der Fiktion. Anhalvan, oder auch aus Aserbaidschan und so weiter.", "score": 65.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_662.wav", "doc_id": "FLkGnzVRew.seg_662", "src_text": "We compare this to the other state-of-the-art AL strategies that are commonly used in the community.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir vergleichen das mit den anderen Staaten der Kunststrategien, die in der Gemeinschaft üblich sind.", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_161.wav", "doc_id": "SLpqvupgvW.seg_161", "src_text": "My name is Javad Hosseini and this is a joint work with Filip Radlinski, Silvia Pareti, and Annie Louis.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Mein Name ist Javad Hosseini und dies ist eine gemeinsame Arbeit mit Philip Radlinski, Sylvia Pappert und Annie Louis.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_94.wav", "doc_id": "uZBWfYjYnf.seg_94", "src_text": "Simultaneous speech translation, or SimulST, is the process of translating spoken language into a text in another language in real time, enabling cross-language communication.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Sprachübersetzung? Simultane Sprachübersetzung oder Simultanübersetzung ist der Prozess der Übersetzung einer gesprochenen Sprache in Echtzeit in einen Text in einer anderen Sprache.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_767.wav", "doc_id": "XejEJmgUmE.seg_767", "src_text": "So, the key takeaways of our work is that language models are sensitive to latent syntactic and semantic features which are shared across the sentences.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "ähnlicher Weise. Der Schlüsselaspekt unserer Arbeit ist, dass Sprachmodelle auf latente syntaktische und semantische Merkmale reagieren, die über die Sätze verteilt sind.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_737.wav", "doc_id": "XejEJmgUmE.seg_737", "src_text": "The current MPP pipeline basically doesn't allow us to evaluate a model's acceptance towards longer sentences.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die derzeitige MP3-Pipeline ermöglicht es uns nicht, die Akzeptanz eines Modells für längere Sätze zu bewerten.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_267.wav", "doc_id": "PIZEXUFLAR.seg_267", "src_text": "Therefore, in this work we want to investigate whether instruction tuning a multi-modal pre-trained models can actually improve generalisation to unseen multi-modal tasks.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Daher wollen wir in dieser Arbeit untersuchen, ob die Anpassung von Anweisungen an multimodale Trainingsmodelle tatsächlich die Generalisierung zu unerforschten multimodalen Aufgaben verbessern kann.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_497.wav", "doc_id": "SUkmfOTvGi.seg_497", "src_text": "We hope our paper calls for more research on how to improve generalizations of the models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir hoffen, dass unser Papier mehr Forschung über die Verbesserung der Modellverallgemeinbarkeit fordert. Und schließlich,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_627.wav", "doc_id": "oeooqChmKK.seg_627", "src_text": "To summarize the main takeaways of our paper, many coreference resolution models appear unable to reason over knowledge from different sources without task-specific training.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zusammenfassend lässt sich sagen, dass viele Referenzlösungsmodelle ohne spezifisches Training nicht in der Lage sind, Wissen aus verschiedenen Quellen zu verarbeiten.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_795.wav", "doc_id": "WTTtiRKFZI.seg_795", "src_text": "So both these sentences are fine.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "kann. Das ist hier veranschaulicht, also sind beide", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_838.wav", "doc_id": "GvEBWkLmuI.seg_838", "src_text": "So here are some example generations from GPT-4.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hier sind einige Beispiele für Generationen von GPT.", "score": 54.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_651.wav", "doc_id": "FLkGnzVRew.seg_651", "src_text": "Given the low occurrence of dissonance and absence of any prior such data set, we are facing the problem of absolute rarity.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Da die Diskrepanz sehr selten vorkommt und es keine vorherige solche Datensatz gibt, haben wir das Problem der absoluten Seltenheit vor uns.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_617.wav", "doc_id": "oeooqChmKK.seg_617", "src_text": "Here's an example of how we control the availability of facts in the true sources.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hier ist ein Beispiel dafür, wie wir die Verfügbarkeit von Fakten in echten Quellen kontrollieren.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_107.wav", "doc_id": "uZBWfYjYnf.seg_107", "src_text": "For example, if we receive a speech chunk containing \"I'm going to talk about...\" and our model predicts the translation in German, and we will look at the cross-attention weights, we'll see that the first two words points to the earliest received speech frames, while the last word points to the last received speech frames, as lambda speech frames.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "um beispielsweise, wenn wir einen Satzabschnitt erhalten, der \"Ich werde darüber sprechen\" enthält, und unser Modell eine Übersetzung in Deutsch vorhersagt. Wir werden uns auf die Kreuzreihenfolge konzentrieren und sehen, dass die ersten beiden Wörter auf die frühesten erhaltenen Sprechrahmen zeigen, während das letzte Wort auf die letzten erhaltenen Sprechrahmen zeigt, also auf", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_357.wav", "doc_id": "gGbuDbHhyc.seg_357", "src_text": "Finally, should we only use the clean samples for validation, or there are better ways to utilize them?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Drittens, sollten wir die sauberen Stichproben nur für die Validierung verwenden oder gibt es bessere Wege, sie zu nutzen?", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_19.wav", "doc_id": "aQpIWggfCo.seg_19", "src_text": "The heat map in the figure shows that the planning performance of InstructGPTs varies considerably for goals of different categories.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Die Abbildung zeigt, dass die Planungsleistung von Unterrichtsstoffen für Mädchen unterschiedlicher Kategorien erheblich variiert.", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_687.wav", "doc_id": "oaOHnMCwad.seg_687", "src_text": "And so one question that people might ask is, do datasets and models have positionality?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Und so eine Frage, die die Menschen vielleicht stellen, lautet: Haben Datensätze und Modelle eine Position?", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_234.wav", "doc_id": "oYCKgTzTDy.seg_234", "src_text": "We also test Monolingual Few-shot setting by training monolingual models with only 10% of training data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir testen auch ein monolinguales Few-Shot-Szenario, indem wir monolinguale Modelle mit nur 10 der Trainingsdaten trainieren.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_542.wav", "doc_id": "dvGkKzmIaN.seg_542", "src_text": "As shown in the figures, it's hard to distinguish between, the backdoor embeddings and normal embeddings.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wie in den Abbildungen zu sehen ist, ist es schwer, zwischen vektorbasierten und normalen Embeddings zu unterscheiden.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_873.wav", "doc_id": "GvEBWkLmuI.seg_873", "src_text": "So based on these patterns, we conclude with three recommendations for model owners.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Basierend auf diesen Mustern können wir drei Empfehlungen für die", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_70.wav", "doc_id": "TVCREhgqUP.seg_70", "src_text": "After the first step, we have all the right tokens, but they're not ordered.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Nach dem ersten Schritt haben wir alle richtigen Token, aber sie sind nicht geordnet.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_73.wav", "doc_id": "TVCREhgqUP.seg_73", "src_text": "This makes our approach quite flexible and expressive.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "aufbürdet. Dies macht unseren Ansatz recht flexibel und expressiv.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_600.wav", "doc_id": "oeooqChmKK.seg_600", "src_text": "Here is an example from our data set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "ein Beispiel aus unserem Datensatz:", "score": 96.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_257.wav", "doc_id": "oYCKgTzTDy.seg_257", "src_text": "To sum up, we build XSemPLR, a unified benchmark for cross-lingual semantic parsing with multiple natural languages and meaning representations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zusammengefasst bauen wir ein Beispiel, ein einheitliches Benchmark für die semantische Verarbeitung von Kreuzwinkeln, mit mehreren natürlichen Sprachen und vielen Darstellungen.", "score": 42.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_594.wav", "doc_id": "oeooqChmKK.seg_594", "src_text": "For example, in the sentence, \"John saw the newly elected president on TV.\"", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zum Beispiel sah John den neugewählten Präsidenten im Fernsehen. Vorbereitete", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_230.wav", "doc_id": "oYCKgTzTDy.seg_230", "src_text": "We use Google Translate API to translate source to the target language, then use monolingual model to train and evaluation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "um die Quelle in die Zielsprache zu übersetzen, und dann verwenden wir ein monolinguales Modell, um zu trainieren und zu bewerten.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_110.wav", "doc_id": "uZBWfYjYnf.seg_110", "src_text": "This means that these three words will be emitted.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dies bedeutet, dass diese drei Wörter erscheinen werden.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_707.wav", "doc_id": "oaOHnMCwad.seg_707", "src_text": "So now we're better equipped to answer who do NLP datasets and models align with the most.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Ländern. Also werden wir nun in der Lage sein, die Modelle mit den meisten LP-Daten zu ermitteln.", "score": 39.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_511.wav", "doc_id": "dvGkKzmIaN.seg_511", "src_text": "The watermark method need to meet the following properties.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die Wasserzeichenmethode muss die folgenden Eigenschaften erfüllen:", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_142.wav", "doc_id": "wLqFAuDnKa.seg_142", "src_text": "And when we go, as in our case, to five-shot prompting, there is nearly no difference to the actual form of the prompting.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "wenn wir, wie in unserem Fall, zu fünf Anregungen gehen, gibt es fast keinen Unterschied in der tatsächlichen Form der Anregung. Es", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_93.wav", "doc_id": "uZBWfYjYnf.seg_93", "src_text": "What is simultaneous speech translation?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Was ist simultane", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_188.wav", "doc_id": "SLpqvupgvW.seg_188", "src_text": "Here are the different sampling methods we've used.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hier sind die verschiedenen Sampling-Methoden, die wir verwendet haben:", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_705.wav", "doc_id": "oaOHnMCwad.seg_705", "src_text": "We then compared these annotations with Dynahate, Perspective API, Rewire API, Hate Roberta and GPT 4.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Annotierungen mit Diana Heat, Perspektive API, Rewire API, Heat Roberta und GPT4. Wir studierten", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_575.wav", "doc_id": "rISrKoXQCx.seg_575", "src_text": "We further show many qualitative examples to see that language models with different political leanings do give different predictions to hate speech and misinformation examples based on their social categories.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "viele qualifizierte Beispiele zeigen, die Sprachmodelle mit unterschiedlichen politischen Bedeutungen zeigen. Sie können verschiedene Vorhersagen zu Sprache und Informationen in Bezug auf die", "score": 34.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_62.wav", "doc_id": "TVCREhgqUP.seg_62", "src_text": "This works well, but trees are usually not given and need to be obtained somehow.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Das funktioniert zwar, aber normalerweise muss man es nicht haben.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_358.wav", "doc_id": "gGbuDbHhyc.seg_358", "src_text": "We addressed these research questions in our work and our findings are as follows.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "sie zu verwenden? Wir behandeln diese Forschungsfragen in unserer Arbeit, und unsere Ergebnisse sind wie folgt.", "score": 99.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_719.wav", "doc_id": "oaOHnMCwad.seg_719", "src_text": "First one is keep a record of all relevant design choices throughout the research process.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "ist eine Aufzeichnung aller relevanten Designoptionen im Forschungsprozess und die andere ist", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_807.wav", "doc_id": "WTTtiRKFZI.seg_807", "src_text": "Ok.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Okay, also, was haben wir herausgefunden?", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_504.wav", "doc_id": "dvGkKzmIaN.seg_504", "src_text": "Let's first introduce the background about embedding as services.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Backdoor Watermark. Lassen Sie uns zunächst den Hintergrund zu eingebetteten Diensten erläutern.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_561.wav", "doc_id": "rISrKoXQCx.seg_561", "src_text": "Secondly, we aim to investigate to which extent the political biases of language models are actually picked up from training data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zweitens wollen wir zwei Wochen investieren, um zu sehen, bis zu welchem Grad die politischen Sprachmodelle tatsächlich aus Trainingsdaten aufgenommen", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_248.wav", "doc_id": "oYCKgTzTDy.seg_248", "src_text": "I think this is known as the \"Curse of Multilinguality\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Ich glaube, das ist in Multilingualität bekannt als „Kürzel“", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_781.wav", "doc_id": "WTTtiRKFZI.seg_781", "src_text": "So, we get some dependencies from end to all the conjuncts.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "So erhalten wir Abhängigkeiten vom Ende bis zu allen Konventionen.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_632.wav", "doc_id": "FLkGnzVRew.seg_632", "src_text": "Hello, my name is Vasudha and I'm a Computer Science PhD candidate at Stony Brook University.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hallo, mein Name ist Vasudha und ich bin ein Ph.D.-Kandidat im Bereich Computerwissenschaften an der Stony Brook University.", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_409.wav", "doc_id": "WBLMIsdIrq.seg_409", "src_text": "We then look at vocabulary items that have high P-CXMI averaged over all of its different occurrences.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dann sehen wir uns die Vokabeln an, die in allen verschiedenen Fällen einen hohen P.I. haben.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_806.wav", "doc_id": "WTTtiRKFZI.seg_806", "src_text": "It violates one principle, but it satisfies another one.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Es verletzt ein Prinzip, aber es erfüllt ein anderes.", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_776.wav", "doc_id": "WTTtiRKFZI.seg_776", "src_text": "So these two approaches are asymmetric.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "so dass diese beiden Ansätze", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_288.wav", "doc_id": "PIZEXUFLAR.seg_288", "src_text": "In each experiment, we report the min and max performance and the standard deviation of the performance across all 5 experiments.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "in jedem Experiment bewerten. Wir berichten über die Mindest- und Maximalleistung und die Standardabweichung der Leistung in allen fünf Experimenten.", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_219.wav", "doc_id": "oYCKgTzTDy.seg_219", "src_text": "As shown in this figure, we need to translate the query in multiple natural languages using neural models to SQL, Lambda or FunQL, and etcetera.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "wie in diesem Bild gezeigt. Wir müssen die Frage in mehreren natürlichen Sprachen übersetzen, z.B. Englisch, Französisch, Spanisch, Deutsch, Chinesisch, Arabisch, Russisch, Hindi, Japanisch, Koreanisch,", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_654.wav", "doc_id": "FLkGnzVRew.seg_654", "src_text": "We transfer from two different tasks: topic independent dissonance stance classification, a task that determines if two debate statements from different people are in agreement or in disagreement, irrespective of topic, called debate here, and on binary classification of expansion and comparison classes of PDTB since these two are closely related to the conception of consonance and dissonance and we call them CE here.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir übertragen zwei verschiedene Aufgaben: 1 Themenunabhängige Diskordanzklassifizierung, eine Aufgabe, die bestimmt, ob zwei Diskussionen von verschiedenen Personen in Übereinstimmung oder in Diskordanz stehen, unabhängig vom Thema, und 2 Binarische Klassifizierung von Erweiterung und Vergleichsklassen von PDTB, da diese beiden eng mit dem Konzept von Konsonanz und Diskordanz zusammenhängen. Wir nennen sie hier CE.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_69.wav", "doc_id": "TVCREhgqUP.seg_69", "src_text": "First, we tag each input token with an unordered multiset of tokens that will appear in the output.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "kennzeichnen wir jedes Eingabe-Token mit einem ungeordneten Multiset von Token, das im Output erscheinen wird.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_592.wav", "doc_id": "oeooqChmKK.seg_592", "src_text": "Recent works in tasks like question answering show that models can use pretrained-time knowledge to solve the task.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Neuere Arbeiten in Aufgaben wie das Beantworten von Fragen zeigen, dass die Modelle vorgebautes Wissen nutzen können, um die Aufgaben zu lösen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_622.wav", "doc_id": "oeooqChmKK.seg_622", "src_text": "In this figure, we show the results of the best-performing models on the most difficult variant of the Background-Pretrain setting.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "In diesem Diagramm zeigen wir die Ergebnisse der besten Modelle auf der schwierigsten Variante der Hintergrundvorbereitung. Mit", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_843.wav", "doc_id": "GvEBWkLmuI.seg_843", "src_text": "The first one is generating these personas.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "wobei der erste diese Personen generiert.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_752.wav", "doc_id": "XejEJmgUmE.seg_752", "src_text": "So this will tell us like whether the models acceptability judgments are actually impacted by any context, like, whether the context is coming from a different subset of the data set, or whether it's like completely irrelevant, to the current like to the sentence that we are looking at.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Dies wird uns sagen, ob die Akzeptabilitätsurteile des Modells tatsächlich von einem Kontext beeinflusst werden. Wie auch immer, ob der Kontext aus einer anderen Teilmenge des Datensatzes kommt oder ob er vollständig irrelevant für den aktuellen Satz ist.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_89.wav", "doc_id": "TVCREhgqUP.seg_89", "src_text": "That's because this is related to the \"Traveling Salesman\" problem.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "weil sie mit dem Problem der Reise von Verkäufern", "score": 25.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_836.wav", "doc_id": "GvEBWkLmuI.seg_836", "src_text": "Describe yourself.\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "beschreibe dich selbst\" erstellt wird,", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_348.wav", "doc_id": "gGbuDbHhyc.seg_348", "src_text": "If we directly train neural networks on weakly labeled data, the neural networks tend to memorize the label noise and do not generalize.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wenn wir neuronale Netzwerke direkt trainieren und die Daten schwach kennzeichnen, neigen die neuronalen Netzwerke dazu, das Label-Geräusch zu merken und nicht zu verallgemeinern.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_788.wav", "doc_id": "WTTtiRKFZI.seg_788", "src_text": "So in English, as you might know, direct objects prefer to be close to the verb, while adjuncts may be further away.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "in Englisch: „Direct objects prefer to be close to the verb, while adjuncts may further away right, because the direct object", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_203.wav", "doc_id": "SLpqvupgvW.seg_203", "src_text": "Here are some examples from our dataset.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hier sind einige Beispiele aus unserem Datensatz.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_79.wav", "doc_id": "TVCREhgqUP.seg_79", "src_text": "We continue this process until every token from the first stage has been visited exactly once.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir setzen dieses Verfahren fort. Bis alle Tokener aus der ersten Etappe genau einmal besucht wurden.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_803.wav", "doc_id": "WTTtiRKFZI.seg_803", "src_text": "So instead of 11, 6 is much shorter.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "sechs, also von elf zu elf sechs kürzer. s ist der Grund, warum dies", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_447.wav", "doc_id": "hgIDlKNiFM.seg_447", "src_text": "In addition to this comparison, we introduced three models trained on continual pre-training to analyze the impact of pre-training strategy.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zusätzlich zu diesem Vergleich stellen wir drei Modell-Trainings zur Analyse des Einflusses der Pre-Training-Strategie vor. Einer", "score": 72.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_801.wav", "doc_id": "WTTtiRKFZI.seg_801", "src_text": "So here we have a dependency from \"read\" to the adjunct of length 7 measured in words and from \"read\" to \"book\" of length 4, so together it's 11.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hier haben wir die Abhängigkeit von rot bis zum Buchstaben sieben und von rot bis zum Buchstaben vier. Wenn Sie diese beiden Konstituenten verschieben", "score": 61.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_90.wav", "doc_id": "TVCREhgqUP.seg_90", "src_text": "We approximate this with a GPU-friendly continuous relaxation that also allows us to backpropagate through the solution and learn the linguistically more plausible permutations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wir approximieren dies mit einer gpU-freundlichen, kontinuierlichen Entspannung, die es uns auch ermöglicht, durch die Lösung zurückzupropagieren und die sprachlich plausibleren Permutationen zu lernen.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_811.wav", "doc_id": "WTTtiRKFZI.seg_811", "src_text": "So when the difference between the lengths of the two conjuncts grows, the shorter conjunct prefers to be the first one, stronger, right?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Wenn also die Länge der beiden Konjunkturen wächst, bevorzugen die kürzeren Konjunkturen, die erste davon zu sein", "score": 58.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_175.wav", "doc_id": "SLpqvupgvW.seg_175", "src_text": "Our data set collection methodology emphasizes informality using a cartoon completion setup.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Unsere Methodik zur Datensammlung betont die Informalität unter Verwendung Ihres Cartoon-Vervollständigungs-Sets.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_446.wav", "doc_id": "hgIDlKNiFM.seg_446", "src_text": "To answer this question, we first train and compare four from-scratch models: a first version of DrBERT, with 7 GB of NACHOS; a second version of 4 GB of set of NACHOS; a first version of ChuBERT, which is a clinical model with 4 GB of sentences taken from clinical notes; and a final version of ChuBERT with a mix of 4 GB of set of NACHOS and 4 GB of clinical notes.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Um diese Frage zu beantworten, werden wir zunächst vier From-Scratch-Modelle trainieren und vergleichen: eine erste Version von Dr. BERT mit 7 GB von Nachos. Eine zweite Version von 4 GB von NatShuffle, eine erste Version von ShuBERT, die ein klinisches Modell mit 4 GB von Sätzen aus klinischen Notizen ist, und eine finale Version von ShuBERT mit einer Mischung von 4 GB von NatShuffle und 4 GB von klinischen Notizen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_130.wav", "doc_id": "wLqFAuDnKa.seg_130", "src_text": "And we compared to state-of-the-art systems, so the best performing system, so the WMT evaluation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir vergleichen zwei Zustände des Systems, nämlich die besten Leistungssysteme der Bewertung.", "score": 28.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_339.wav", "doc_id": "dJGfOSFgZO.seg_339", "src_text": "And we look forward to seeing how conversational AI will advance in the coming months and years.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "wir freuen uns darauf, in den kommenden Monaten und Jahren zu sehen,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_261.wav", "doc_id": "oYCKgTzTDy.seg_261", "src_text": "And welcome to visit our paper and code.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "freuen uns, unsere Arbeit und unseren Code zu", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_478.wav", "doc_id": "SUkmfOTvGi.seg_478", "src_text": "The first one is the model architecture.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Die erste ist die Modellarchitektur.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_400.wav", "doc_id": "WBLMIsdIrq.seg_400", "src_text": "In this work, we extend CXMI to Pointwise CXMI which can measure context usage at the sentence level or at the word level.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "In dieser Arbeit erweitern wir Cxmi zu einem Punkt xmi, der den Satz- oder Wortgebrauch messen kann.", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_207.wav", "doc_id": "SLpqvupgvW.seg_207", "src_text": "If the language model has access to the exact same background knowledge as the annotators, then the accuracy is really high, it's around 92 to 95%.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wenn das Sprachmodell Zugriff auf die genaue gleiche Hintergrundinformation wie die Anmerkungen hat, ist die Genauigkeit wirklich hoch, nämlich um die neunzig bis fünfzig", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_404.wav", "doc_id": "WBLMIsdIrq.seg_404", "src_text": "We perform our analysis at three different levels.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir führen unsere Analysen auf drei verschiedenen Ebenen", "score": 74.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_129.wav", "doc_id": "wLqFAuDnKa.seg_129", "src_text": "This involves using the latest test sets to avoid an overlap of the test data with the training data of the language model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "wobei die neuesten Tests verwendet werden, um eine Überlagerung der Tests mit der Ausbildung der Daten des Sprachmodells zu vermeiden.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oeooqChmKK.seg_626.wav", "doc_id": "oeooqChmKK.seg_626", "src_text": "Additional experiments with fictional knowledge indicated even the best performing models, cannot reliably integrate backward knowledge provided only at inference time.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "nicht nützlich. Weitere Experimente mit fiktivem Wissen zeigten, dass selbst die besten Modelle nicht funktionierten. Wir können die Hintergrundwissen, die nur zur Inferenzzeit bereitgestellt werden, nicht zuverlässig integrieren.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_255.wav", "doc_id": "oYCKgTzTDy.seg_255", "src_text": "For example, Encoder-Decoder outperforms previous work or achieves comparable results.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "fest: zum Beispiel, dass Encoder-Decoder über die Leistung von Prozessarbeiten hinausgeht oder vergleichbare Ergebnisse", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_660.wav", "doc_id": "FLkGnzVRew.seg_660", "src_text": "Over the different strategies, we found that Cumulative performed equal or better than Iterative across the board.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "verschiedenen Strategien finden wir heraus, dass die kumulative Form gleich oder besser als die iterative Form über den Tisch ist.", "score": 45.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_575.wav", "doc_id": "rISrKoXQCx.seg_575", "src_text": "We further show many qualitative examples to see that language models with different political leanings do give different predictions to hate speech and misinformation examples based on their social categories.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "viele qualitative Beispiele es gibt, um die Sprachmodelle mit unterschiedlichen politischen Bedeutungen zu sehen. Geben Sie verschiedene Prognosen, um Beispiele für Sprache und Informationen nach den sozialen Kategorien", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_157.wav", "doc_id": "wLqFAuDnKa.seg_157", "src_text": "For more details, please come to the full presentation of the paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "für weitere Einzelheiten bitte die vollständige Präsentation des Papiers", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_568.wav", "doc_id": "rISrKoXQCx.seg_568", "src_text": "We can see that language models generally had a political leaning that is further away from the centre after 2017.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir können sehen, dass die Sprachmodelle allgemein eine politische Ausrichtung haben, die sich weiter von der Mitte nach rechts", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_374.wav", "doc_id": "gGbuDbHhyc.seg_374", "src_text": "Our concrete recommendations for future work are as follows.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Unsere konkreten Empfehlungen für zukünftige Arbeit sind wie folgt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_422.wav", "doc_id": "WBLMIsdIrq.seg_422", "src_text": "And if we use word f-measure, then models with and without context have comparable performance.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "verwenden, dann haben Modelle mit und ohne Kontext vergleichbare Leistungen.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_240.wav", "doc_id": "oYCKgTzTDy.seg_240", "src_text": "So during training, we train it on English queries or the combination of English and German Few-shot queries to train a multilingual model to predict the SQL output.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Während des Trainings werde ich mich auf englische Anfragen oder die Kombination aus englischen und deutschen Fuzz-Quellen einstellen, um ein multilinguales Modell zu trainieren und den Output des Cecel zu vorhersagen.", "score": 51.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_275.wav", "doc_id": "PIZEXUFLAR.seg_275", "src_text": "OFA uses a unified vocabulary for language, image tokens and the coordinates of a bounding box.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "OFA verwendet ein einheitliches Vokabular für Sprache, Bildsymbol und Koordinaten eines Bindungskastens.", "score": 48.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_300.wav", "doc_id": "PIZEXUFLAR.seg_300", "src_text": "So this shows the effect of different fine-tuning strategies on the model sensitivity.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wie wir sehen können, zeigt dies die Auswirkungen der unterschiedlichen Abstimmungsstrategie auf die Modellempfindlichkeit. Durch das", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_752.wav", "doc_id": "XejEJmgUmE.seg_752", "src_text": "So this will tell us like whether the models acceptability judgments are actually impacted by any context, like, whether the context is coming from a different subset of the data set, or whether it's like completely irrelevant, to the current like to the sentence that we are looking at.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Das wird uns sagen, ob die Akzeptanzurteile des Modells tatsächlich von einem Kontext beeinflusst werden. Ob der Kontext aus einem anderen Unterdatensatz des Datensatzes stammt oder ob er zum aktuellen Satz, den wir betrachten, völlig irrelevant ist.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_503.wav", "doc_id": "dvGkKzmIaN.seg_503", "src_text": "Protecting the copyright of large language models for embedding as services via backdoor watermark.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "mein Modell zu kopieren und die Urheberrechte von Modellen für Verpackungen und Dienstleistungen in großen Sprachen", "score": 20.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_518.wav", "doc_id": "dvGkKzmIaN.seg_518", "src_text": "Therefore, in this paper we propose Embedding marker, which is a backdoor based watermark method applicable to embedding as services.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Deshalb schlagen wir in diesem Papier den Embedding-Marker vor, eine auf der Hintertür basierende Wasserzeichenmethode, die für Embedding-Dienste geeignet ist.", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_437.wav", "doc_id": "hgIDlKNiFM.seg_437", "src_text": "And finally, we conclude about the experiments and give you more details about how to access those models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "schließlich kommen wir zu den Experimenten. Wir geben Ihnen mehr Details darüber, wie man Zugriff auf diese Modelle erhält.", "score": 72.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_176.wav", "doc_id": "SLpqvupgvW.seg_176", "src_text": "The cartoon has three speech bubbles.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Der Cartoon hat drei Sprachbläser:", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_468.wav", "doc_id": "SUkmfOTvGi.seg_468", "src_text": "Firstly, can these models generalise to modern data?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Können diese Modelle auf moderne Daten generalisiert werden?", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_440.wav", "doc_id": "hgIDlKNiFM.seg_440", "src_text": "Specialized models for other languages are scarce and are often based on continual pre-training due to the lack of in-domain data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Spezialisierte Modelle für andere Sprachen sind selten und basieren oft auf kontinuierlicher Ausbildung aufgrund des Mangels an In-Domain-Daten. Allerdings hatte Frankreich bis jetzt kein", "score": 81.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_306.wav", "doc_id": "PIZEXUFLAR.seg_306", "src_text": "So this is a QR code for our data and model.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Dies ist ein QR-Code für unsere Daten und unser Modell.", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_357.wav", "doc_id": "gGbuDbHhyc.seg_357", "src_text": "Finally, should we only use the clean samples for validation, or there are better ways to utilize them?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "sollten wir nur die sauberen Proben verwenden, oder gibt es bessere Möglichkeiten, sie zu nutzen?", "score": 97.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_17.wav", "doc_id": "aQpIWggfCo.seg_17", "src_text": "Results in the figure show that the semantic completeness in generated scripts is acceptable but the faithfulness to the constraints cannot be guaranteed.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "in den Abbildungen zeigen, dass die semantische Vollständigkeit in den generierten Skripten akzeptabel ist, aber die Treue nicht. Diese Einschränkungen können nicht garantiert werden.", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_837.wav", "doc_id": "GvEBWkLmuI.seg_837", "src_text": "And we can immediately see that this is very generalizable to any demographic because we can just specify whatever identity marker that we want into this prompt.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Und wir können sofort sehen, dass dies für jede Demokratie sehr allgemein ist, weil wir einfach spezifizieren können, was immer Identitätsmarker wir in diesen Prompt wollen.", "score": 60.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_159.wav", "doc_id": "SLpqvupgvW.seg_159", "src_text": "Hi!", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Hi,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_863.wav", "doc_id": "GvEBWkLmuI.seg_863", "src_text": "And these words define these groups only by their relationship to their identity and distinguish them as different from the white norm.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und diese Wörter definieren diese Gruppen nur durch ihre Beziehung zu ihrer Identität und unterscheiden sie als unterschiedlich von der weißen Norm.", "score": 98.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_860.wav", "doc_id": "GvEBWkLmuI.seg_860", "src_text": "So instead to do that, we'll turn to the results from our Marked Words method to show how these positive-seeming words facilitate stereotypes and essentializing narratives.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "also werden wir stattdessen die Ergebnisse unserer Markierungswörter-Methode verwenden, um zu zeigen, wie diese positiv erscheinenden Wörter Stereotypen und Essenzialisierungen erleichtern.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_317.wav", "doc_id": "dJGfOSFgZO.seg_317", "src_text": "However, we believe there is a more precise and reliable strategy for dimensional dialogue evaluation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir glauben jedoch, dass es sich um eine präzisere und zuverlässigere Strategie für die Auswertung von Dialogdimensionen handelt.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_101.wav", "doc_id": "uZBWfYjYnf.seg_101", "src_text": "First, to use already existing offline ST models without re-training or adopting specific architecture for SimulST.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zuerst verwenden Sie bereits vorhandene Offline-S-Modelle, ohne eine spezifische Architektur für SimulS zu trainieren oder anzupassen. Verwenden Sie", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_785.wav", "doc_id": "WTTtiRKFZI.seg_785", "src_text": "Now the aim of this paper is to produce a novel argument for the symmetric structures of coordination, like these two and against the asymmetric structures of coordination, like these two.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Das Ziel des Papiers besteht nun darin, ein neues Argument für symmetrische Koordinationsstrukturen wie diese und gegen asymmetrische Koordinationsstrukturen wie diese zu liefern.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_752.wav", "doc_id": "XejEJmgUmE.seg_752", "src_text": "So this will tell us like whether the models acceptability judgments are actually impacted by any context, like, whether the context is coming from a different subset of the data set, or whether it's like completely irrelevant, to the current like to the sentence that we are looking at.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Daher wird uns gesagt, ob die Akzeptabilitätsurteile des Modells tatsächlich von irgendeinem Kontext beeinflusst werden. Wie, ob der Kontext aus einem anderen Teil des Datensatzes kommt oder ob er vollkommen irrelevant zum aktuellen Satz ist.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_405.wav", "doc_id": "WBLMIsdIrq.seg_405", "src_text": "First, we look at part-of-speech tags that have high mean P-CXMI.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "durch, zuerst werden die Sprechtags mit „hoch“ gekennzeichnet. Und das lässt", "score": 42.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_485.wav", "doc_id": "SUkmfOTvGi.seg_485", "src_text": "The first one is adaptive overfitting, which is overfitting costs by reusing the same test set over and over again and this is usually manifested as the diminishing returns on a new test set.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Die erste ist adaptiver Überanschnitt, der durch die wiederholte Verwendung desselben Testsets verursacht wird, und dies wird normalerweise als die Abnahme auf einem neuen Testset zurückkehrt.", "score": 52.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_687.wav", "doc_id": "oaOHnMCwad.seg_687", "src_text": "And so one question that people might ask is, do datasets and models have positionality?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Und eine Frage, die Leute stellen, ist, ob Datensätze von Modellen positionsspezifisch sind.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_102.wav", "doc_id": "uZBWfYjYnf.seg_102", "src_text": "Use only one model for every latency regime and handle latency through specific parameters.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Sie nur ein Modell für jedes Latenzregime und behandeln Latenzien durch spezifische Parameter. Und", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_3.wav", "doc_id": "aQpIWggfCo.seg_3", "src_text": "Previous work has exploited language models to plan for abstract goals of stereotypical activities such as \"make a cake\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Vorherige Arbeiten haben Sprachmodelle genutzt, um Ziele abstrakter stereotypischer Aktivitäten wie Make-a-Kick zu planen,", "score": 71.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_729.wav", "doc_id": "XejEJmgUmE.seg_729", "src_text": "I'm Koustav Sinha, and I'm pleased to welcome you to our talk of our ACL 2023 paper.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "ich bin Kostas Sana und freue mich, dass ich Sie zu unserer Diskussion über unser ACL 2023-Papier begrüßen darf: Sprachmodellakzeptabilitätsurteile", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_200.wav", "doc_id": "SLpqvupgvW.seg_200", "src_text": "For recipes, we additionally show their images, again from Wikipedia, so that the annotators know how they look like.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Für die Rezepte zeigen wir zusätzlich ihre Bilder wieder von Wikipedia, damit die Kommentatoren wissen, wie sie aussehen.", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_581.wav", "doc_id": "rISrKoXQCx.seg_581", "src_text": "It's like between Scylla and Charybdis.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "einer Sprache ergibt, wie zwischen Sial und Kurdisch, aufdecken.", "score": 0.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_180.wav", "doc_id": "SLpqvupgvW.seg_180", "src_text": "Which is the alternative question.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Das ist die alternative", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_252.wav", "doc_id": "oYCKgTzTDy.seg_252", "src_text": "While the green line is the Monolingual Setting.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Einstellung ist. Wir stellten fest, dass bei", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_309.wav", "doc_id": "dJGfOSFgZO.seg_309", "src_text": "And I'm Sarah Finch.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "und ich bin Sarah Finch,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_858.wav", "doc_id": "GvEBWkLmuI.seg_858", "src_text": "So, really just only the positive or at least non-negative ones.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Also wirklich nur die positiven oder zumindest nicht negativen.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_139.wav", "doc_id": "wLqFAuDnKa.seg_139", "src_text": "So in this example here, where we perform translation from German into English, the German sentences, the source sentences, are marked with German colon and the English translations with English colon.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "In diesem Beispiel, in dem wir von Deutsch ins Englische übersetzen, sind die deutschen Sätze mit deutscher Kursivschrift und die englischen Übersetzungen mit englischer Kursivschrift gekennzeichnet.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_793.wav", "doc_id": "WTTtiRKFZI.seg_793", "src_text": "Because then it can be moved to the position after the adjunct.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "da es dann nach dem Angriff in", "score": 67.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_225.wav", "doc_id": "oYCKgTzTDy.seg_225", "src_text": "So to this end we propose XSemPLR.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "diesem Zweck schlagen wir ein Beispiel", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_754.wav", "doc_id": "XejEJmgUmE.seg_754", "src_text": "So first, we look at the Wikipedia sentences, which are completely irrelevant to the current query pair, and there we find that the MPP judgments are mostly robust for arbitrary context length.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zunächst betrachten wir die Satzformen von Wikipedia, die vollkommen irrelevant für die aktuelle Suchanfragepaar sind, und dort stellen wir fest, dass die MPD-Urteile für die willkürlichen Kontextlinien am stärksten sind.", "score": 45.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_640.wav", "doc_id": "FLkGnzVRew.seg_640", "src_text": "So why does this matter?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Warum ist das so? Studien", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_360.wav", "doc_id": "gGbuDbHhyc.seg_360", "src_text": "Otherwise, there is a large performance drop.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "ansonsten ist es ein großer Leistungsabfall,", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_776.wav", "doc_id": "WTTtiRKFZI.seg_776", "src_text": "So these two approaches are asymmetric.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "also sind diese beiden", "score": 25.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_377.wav", "doc_id": "gGbuDbHhyc.seg_377", "src_text": "Second, WSL approaches should be compared with few-shot learning baselines, as both work on clean samples.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Zweitens sollten Wsl-Ansätze mit zukünftigen Lernbaselines verglichen werden, die auf klaren Mustern basieren.", "score": 50.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_81.wav", "doc_id": "TVCREhgqUP.seg_81", "src_text": "Our model outperforms the others by a large margin on generalization to deeper recursion.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Unser Modell übertrifft die anderen bei der Generalisierung und der tieferen Rekursion deutlich.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_112.wav", "doc_id": "uZBWfYjYnf.seg_112", "src_text": "So we want our curves to be as high as possible on this plot.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Also wollen wir, dass unsere Kurse so hoch wie möglich sind", "score": 66.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_286.wav", "doc_id": "PIZEXUFLAR.seg_286", "src_text": "Each instance is randomly combined with one of its five instruction templates.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wobei jede Instanz zufällig mit einem der fünf Anweisungstemplates kombiniert", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_758.wav", "doc_id": "XejEJmgUmE.seg_758", "src_text": "So here we are choosing or creating sentences from acceptable and unacceptable domains from the same BLiMP or SyntaxGym dataset.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Also hier wählen wir Sätze aus akzeptablen und unakzeptablen Domänen aus dem gleichen Bimap- oder Syntax-Dataset", "score": 89.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_410.wav", "doc_id": "WBLMIsdIrq.seg_410", "src_text": "And this helps us identify cases like the one here, where in Chinese you need context to translate proper nouns to make sure that you're using the same translation within the document.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und das hilft dabei, Fälle wie diesen zu identifizieren, in denen Chinesisch eine Transliteration benötigt, um sicherzustellen, dass Sie die gleiche Transliteration im Dokument verwenden.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_398.wav", "doc_id": "WBLMIsdIrq.seg_398", "src_text": "In the previous work, we introduced CXMI as a measure for context usage by machine translation models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "In der vorherigen Arbeit haben wir CxMI als Maß für die Kontextnutzung durch maschinelle Übersetzungsmodelle eingeführt", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_56.wav", "doc_id": "TVCREhgqUP.seg_56", "src_text": "In contrast to standard machine learning evaluation, the test set does not come from the same distribution but contains structurally unseen logical forms.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Im Gegensatz zur standardisierten Bewertung der Maschinen ist der Test nicht Teil der gleichen Verteilung, sondern enthält strukturelle logische Formen.", "score": 74.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_346.wav", "doc_id": "gGbuDbHhyc.seg_346", "src_text": "Instead, we label the data using weak labeling sources, such as simple heuristic rules, knowledge bases, or low-quality crowdsourcing, as illustrated in the figure on the right.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "die Daten mithilfe schwacher Kennzeichnungsquellen, wie z. B. einfache heuristische Regeln, Wissensbasen oder qualitativ minderwertige Crowdsourcing, wie in der Abbildung rechts dargestellt.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_413.wav", "doc_id": "WBLMIsdIrq.seg_413", "src_text": "And this allows us to identify phenomena that cannot really be captured by the word itself, but that's rather expressed in the sentence structure, such as ellipses resolution.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und dies ermöglicht es uns, Phänomene zu identifizieren, die nicht wirklich vom Wort selbst erfasst werden können, sondern vielmehr in einer Satzstruktur ausgedrückt werden, wie z. B. die Ellipsenauflösung. Daher verwenden", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_674.wav", "doc_id": "oaOHnMCwad.seg_674", "src_text": "Hi everyone.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Hallo,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_199.wav", "doc_id": "SLpqvupgvW.seg_199", "src_text": "For the recipes and books domain, we show some background text from Wikipedia.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Für die Rezepte- und Bücher-Domain zeigen wir einige Hintergrundtexte von Wikipedia", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_549.wav", "doc_id": "rISrKoXQCx.seg_549", "src_text": "Political news media are well covered in their pretraining data.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Politische Nachrichtenmedien werden in ihren Vorbereitungsdaten berücksichtigt, gemäß", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_757.wav", "doc_id": "XejEJmgUmE.seg_757", "src_text": "Now, what happens when we choose sentences from the same data set?", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Was passiert nun, wenn wir Sätze aus dem gleichen Datensatz auswählen?", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_115.wav", "doc_id": "uZBWfYjYnf.seg_115", "src_text": "And we compare also with the state-of-the-art architecture specifically tailored for simultaneous pre-translation.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "und vergleichen auch mit der derzeitigen Architektur, die speziell für die simultane Übersetzung entwickelt wurde.", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_582.wav", "doc_id": "rISrKoXQCx.seg_582", "src_text": "So if we do not sanitize political opinions in language model training data, the bias would propagate from pretraining data to language models to downstream tasks, ultimately creating fairness issues.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wenn wir die politischen Meinungen in den Sprachmodellen nicht standardisieren, würden die Vorurteile von den Prä-Trainingsdaten auf die Sprachmodelle übertragen und schließlich Fairnessprobleme verursachen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dJGfOSFgZO.seg_311.wav", "doc_id": "dJGfOSFgZO.seg_311", "src_text": "This work was done by the Emory NLP Lab led by Professor Jinho Choi at Emory University and in collaboration with Amazon Alexa AI.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Diese Arbeit wurde vom Emory NLP-Lab, geleitet von Professor Gino Ochoa an der Emory University, und in Zusammenarbeit mit Amazon Alexa AI durchgeführt.", "score": 61.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_645.wav", "doc_id": "FLkGnzVRew.seg_645", "src_text": "To the goal of creating a cognitive dissonance resource, we conducted a large scale annotation of dissonance relations.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zum Ziel, kognitive Dissonanzressourcen zu schaffen, haben wir eine große Anzahl von Dissonanzbeziehungen erstellt.", "score": 85.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/TVCREhgqUP.seg_74.wav", "doc_id": "TVCREhgqUP.seg_74", "src_text": "Conceptually, our permutation model works roughly like this.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Konzeptionell funktioniert unser Permutationsmodell in etwa wie dieses.", "score": 94.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_497.wav", "doc_id": "SUkmfOTvGi.seg_497", "src_text": "We hope our paper calls for more research on how to improve generalizations of the models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir hoffen, dass unser Papier zu weiteren Forschungen anregt, wie die Generalisierung der Modelle verbessert werden kann.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_494.wav", "doc_id": "SUkmfOTvGi.seg_494", "src_text": "At the same time, we also found that the performance drop here is caused by temporal drift and kind of surprisingly, it is not caused by adaptive overfitting even though CoNLL-2003 has been used for over 20 years.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "wir fest, dass der Leistungsabfall hier durch einen zeitlichen Drift verursacht wird und überraschenderweise nicht durch ein adaptives Überziehen verursacht wird, obwohl Cornal 2003 über 20 Jahre verwendet wurde.", "score": 58.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_841.wav", "doc_id": "GvEBWkLmuI.seg_841", "src_text": "And both of the women of color personas make references to ancestry while the white man persona has nothing of the sort.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "und beide Frauen mit farbigen Persönlichkeiten beziehen sich auf Ahnen, während der weiße Mann keine solche Persönlichkeit hat.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_749.wav", "doc_id": "XejEJmgUmE.seg_749", "src_text": "So here the sentences are still coming from a, relevant data sets but it's not from the same data set that you are evaluating with.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Missmatch-Szenario. Also die Sätze kommen immer noch aus relevanten Datensätzen, aber nicht aus demselben Datensatz, den Sie bewerten. Und wir", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/PIZEXUFLAR.seg_287.wav", "doc_id": "PIZEXUFLAR.seg_287", "src_text": "So during test for each task, we conduct a total of 5 experiments by evaluating the model using one of the five instructions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Während der Testphase führen wir für jede Aufgabe insgesamt fünf Experimente durch, indem wir das Modell mit einem der fünf Anweisungen", "score": 93.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_672.wav", "doc_id": "FLkGnzVRew.seg_672", "src_text": "Feel free to get in touch with us if you have any questions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wenn Sie Fragen haben, zögern Sie nicht, uns zu kontaktieren.", "score": 70.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_178.wav", "doc_id": "SLpqvupgvW.seg_178", "src_text": "And with that, Bob sets the dialogue context.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "und mit dem Lied setzt Bob den Dialogkontext.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_867.wav", "doc_id": "GvEBWkLmuI.seg_867", "src_text": "For Asian women, the words are things like \"petite\" and \"delicate\" and \"silky\" which connects to a long history of Asian women being hyper-sexualized, seen as very docile and submissive, and so on.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "sind für asiatische Frauen die Wörter Dinge wie „petite“ und „delicate“ und „silk“. Das verbindet sich mit einer langen Geschichte asiatischer Frauen, die hypersexualisiert sind, sehen als sehr dämonisch und unterwürfig und so", "score": 30.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_507.wav", "doc_id": "dvGkKzmIaN.seg_507", "src_text": "For example, OpenAI offers a GPT based embedding API.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Zum Beispiel bietet OpenI eine gpde-basierte Embedding-Applikation an.", "score": 80.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oaOHnMCwad.seg_690.wav", "doc_id": "oaOHnMCwad.seg_690", "src_text": "However these works really don't look at comparing end users with the datasets and models themselves, and studying model and data set positionality is increasingly important as NLP tasks become more subjective and socially oriented, and it's challenging to characterise how these positionalities are skewed because not all decisions are documented and many models are hidden behind APIs.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Allerdings untersuchen diese Arbeiten diese Arbeiten wirklich nicht, indem sie Endnutzer mit den Datensätzen und Modellen selbst vergleichen, und das Studium der Modell- und Datensatzposition ist zunehmend wichtig, da NLP-Aufgaben subjektiver und sozial orientierter werden, und es ist herausfordernd, zu charakterisieren, wie diese Positionen verzerrt sind, da nicht alle Entscheidungen dokumentiert sind und viele Modelle hinter APIs versteckt sind.", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_247.wav", "doc_id": "oYCKgTzTDy.seg_247", "src_text": "We found it is because most of the major natural languages can obtain performance gain, except that English performance drops in seven datasets and only gains in three datasets.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Es wurde festgestellt, dass die meisten natürlichen Sprachen Leistungssteigerungen erzielen können, mit Ausnahme der englischen Sprache, die in sieben Datensätzen abnimmt und nur in drei Datensätzen Leistungssteigerungen", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/aQpIWggfCo.seg_13.wav", "doc_id": "aQpIWggfCo.seg_13", "src_text": "We sample 100 specific goals and evaluate the scripts generated from large language models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Wir stellten hundert spezifische Ziele her und bewerteten die von größeren Modellen generierten Skripte. In dieser", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_494.wav", "doc_id": "SUkmfOTvGi.seg_494", "src_text": "At the same time, we also found that the performance drop here is caused by temporal drift and kind of surprisingly, it is not caused by adaptive overfitting even though CoNLL-2003 has been used for over 20 years.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Gleichzeitig stellten wir fest, dass der Leistungsabfall hier durch temporale Drift verursacht wird, und zwar nicht durch adaptives Überfitting, obwohl der Corel II über zwanzig Jahre lang verwendet", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/wLqFAuDnKa.seg_133.wav", "doc_id": "wLqFAuDnKa.seg_133", "src_text": "The prompting has a big influence on the performance of the LLMs for translation, as we can see in a simple experiment, where we used one-shot prompting and provided two different prompts for each sentence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Prompting hat einen großen Einfluss auf die Leistung der ELMs für die Übersetzung. Wie wir in einem einfachen Experiment sehen können, wo wir einen Prompting und zwei verschiedene Prompts für eine", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/XejEJmgUmE.seg_741.wav", "doc_id": "XejEJmgUmE.seg_741", "src_text": "So that is the approach.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Also, das ist der Ansatz,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WTTtiRKFZI.seg_786.wav", "doc_id": "WTTtiRKFZI.seg_786", "src_text": "OK.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Okay,", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_451.wav", "doc_id": "hgIDlKNiFM.seg_451", "src_text": "To evaluate our seven models, we gather data for public and private downstream tasks such as named entity recognition, classification, part-of-speech tagging, and question answering.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Um unsere sieben Modelle zu bewerten, haben wir mehrere öffentliche und private Aufgaben ermittelt, wie z. B. Namens- und Authentifizierung, Klassifizierung, Part-of-Speech-Tagging und Fragenbeantwortung. Dieses", "score": 55.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_373.wav", "doc_id": "gGbuDbHhyc.seg_373", "src_text": "Their performance gain and practicality are heavily overestimated.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "ihr Leistungszuwachs und ihre Praktikabilität werden stark überschätzt.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_185.wav", "doc_id": "SLpqvupgvW.seg_185", "src_text": "We always use a simple template.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Wir verwenden immer ein einfaches Template,", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_354.wav", "doc_id": "gGbuDbHhyc.seg_354", "src_text": "The aforementioned doubt is asked to ask three research questions.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Das oben genannte Zweifel führt uns zu drei Forschungsfragen:", "score": 91.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SUkmfOTvGi.seg_473.wav", "doc_id": "SUkmfOTvGi.seg_473", "src_text": "We then fine-tuned over 20 models on CoNLL-2003.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Dann haben wir über zwanzig Modelle auf dem Konrad", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/uZBWfYjYnf.seg_108.wav", "doc_id": "uZBWfYjYnf.seg_108", "src_text": "This means that the first two words will be emitted while since the sum of the cross-attention is above a certain threshold alpha, we will not emit the last word and we wait for another speech chunk.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_NLE_primary", "tgt_text": "Das bedeutet, dass die ersten beiden Wörter ausgesendet werden. Da die Summe der Kräfte der Kräfte der Kräfte über eine bestimmte Schwelle liegt, werden wir das letzte Wort nicht aussprechen und auf einen anderen Wortlaut warten.", "score": 82.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_377.wav", "doc_id": "gGbuDbHhyc.seg_377", "src_text": "Second, WSL approaches should be compared with few-shot learning baselines, as both work on clean samples.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zweitens sollten WSL-Ansätze mit zukünftigen Lerngrundlagen, einer hypothetischen Arbeit an klaren Beispielen, verglichen werden.", "score": 56.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_862.wav", "doc_id": "GvEBWkLmuI.seg_862", "src_text": "First, from our groups, the top words include things like \"culture\", \"tradition\", \"proud\", and \"exotic\".", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Zu den Top-Wörtern für Mark-Gruppen gehören Dinge wie Kultur, Tradition, Proud und Exotic,", "score": 45.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/oYCKgTzTDy.seg_241.wav", "doc_id": "oYCKgTzTDy.seg_241", "src_text": "And we also find many interesting results.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Und wir finden auch viele interessante Ergebnisse,", "score": 95.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/dvGkKzmIaN.seg_537.wav", "doc_id": "dvGkKzmIaN.seg_537", "src_text": "We conduct experiments on four data sets AG News, MIND, SST2 and Enron Spam.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Wir führen Experimente mit vier Datensätzen durch: agnews, mind, sstwo und araspam.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/rISrKoXQCx.seg_557.wav", "doc_id": "rISrKoXQCx.seg_557", "src_text": "This ensures us to do automatic evaluation well grounded in political science literature.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "automatische Bewertungen zu gewährleisten. So zeigen", "score": 10.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/GvEBWkLmuI.seg_841.wav", "doc_id": "GvEBWkLmuI.seg_841", "src_text": "And both of the women of color personas make references to ancestry while the white man persona has nothing of the sort.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "Gesichtern beziehen sich auf die Abstammung, während die weiße Männerpersönlichkeit nichts davon hat. Um", "score": 40.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/WBLMIsdIrq.seg_424.wav", "doc_id": "WBLMIsdIrq.seg_424", "src_text": "Now, we use the MuDA benchmark to evaluate models and we find that context-aware models are significantly more accurate than models that do not use context for certain discourse phenomena such as formality and lexical cohesion.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "verwenden wir den Mood-Benchmark, um Modelle zu bewerten, und wir stellen fest, dass Kontext-Modelle für bestimmte Diskursphänomene wie Formalität und Lexikalische Kohäsion deutlich genauer sind. Diese Modelle", "score": 75.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/gGbuDbHhyc.seg_350.wav", "doc_id": "gGbuDbHhyc.seg_350", "src_text": "In recent works in WSL, so WSL stands for Weakly Supervised Learning, a common claim is that people say that they only train models on the weakly labeled data and achieve high performance on clean test sets.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_primary", "tgt_text": "In jüngsten Arbeiten in WSL steht WSL für „Weekly Supervised Learning“. Eine gängige Behauptung ist, dass die Leute sagen, dass sie nur Modelle unter der wöchentlichen Datenebene trainieren und eine hohe Leistung auf sauberen Testsets erzielen.", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/FLkGnzVRew.seg_637.wav", "doc_id": "FLkGnzVRew.seg_637", "src_text": "Further mentioning that \"I don't think I could keep my job without them\" justifies the second occurrence.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "dass ich nicht denke, dass ich ohne sie meine Arbeit aufrechterhalten könnte, rechtfertigt die zweite Erscheinung,", "score": 90.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/hgIDlKNiFM.seg_450.wav", "doc_id": "hgIDlKNiFM.seg_450", "src_text": "In total, we have seven models.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "long_KIT_primary", "tgt_text": "Insgesamt haben wir sieben Modelle.", "score": 100.0}
{"audio_path": "data/iwslt25/IWSLT25INSTRUCT/segmented/SLpqvupgvW.seg_203.wav", "doc_id": "SLpqvupgvW.seg_203", "src_text": "Here are some examples from our dataset.", "src_text_system": "human", "src_lang": "en", "tgt_lang": "de", "domain": "acl", "tgt_system": "short_CUNI-NL_contrastive", "tgt_text": "Hier sind einige Beispiele aus unserem Datensatz.", "score": 100.0}