First, we need a dataset where it's easy to tell whether the model has actually trained. Let's create one that makes our model talk like Yoda. We can take a bunch of questions from TriviaQA and generate responses by prompting an LLM to answer each question while pretending it's Yoda. Running the script, I get a few thousand prompts paired with Yoda-styled responses.
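The generation step above can be sketched as follows. This is a minimal illustration, not the author's actual script: the system-prompt wording, the `build_messages` / `to_training_example` helper names, and the JSON record format are all assumptions; the actual API call to whichever LLM provider is used is left as a placeholder.

```python
# Sketch of a Yoda-dataset generation script (names and prompt wording
# are illustrative assumptions, not the author's exact code).

SYSTEM_PROMPT = (
    "Answer the user's question accurately, but phrase every answer "
    "the way Yoda from Star Wars speaks."
)

def build_messages(question: str) -> list[dict]:
    """Chat-format messages to send to an instruction-tuned LLM."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": question},
    ]

def to_training_example(question: str, response: str) -> dict:
    """One supervised fine-tuning record: the trivia question as the
    prompt, the Yoda-styled answer as the target response."""
    return {"prompt": question, "response": response}

if __name__ == "__main__":
    # For each TriviaQA question, you would send build_messages(question)
    # to your LLM provider of choice and save the reply; the API call
    # itself is omitted here since it depends on the provider.
    example_messages = build_messages("What is the capital of France?")
    print(example_messages)
```

Looping this over a few thousand TriviaQA questions and writing each record out as JSON lines gives a dataset in the usual prompt/response format for supervised fine-tuning.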