Where to take the Turing Test online


Jill Watson, a teacher at Georgia Tech (USA), spent five months helping students with their computer program design projects. She was considered an outstanding teacher until it turned out that Jill Watson was not a person but a robot: an artificial intelligence system built on IBM Watson. The story was reported by The Wall Street Journal.

Together with nine human teachers, the robot Jill helped about 300 students develop programs related to presentation design, for example programs that select suitable pictures and illustrations.

Jill helped students on an online forum where they submitted and discussed their work. She used slang and colloquial expressions such as “Yep!”; in other words, she behaved like an ordinary person.

“She was supposed to remind us of deadlines and use questions to stir up discussion of the work. It was like a normal conversation with an ordinary person,” said university student Jennifer Gavin.

Another student, Shreyas Vidyarthi, imagined Jill as a pretty white woman in her twenties working on her doctoral dissertation.

Even student Barrick Reed, who had worked for IBM for two years creating programs for Jill Watson, did not suspect that she was a robot. Even the name “Watson” did not strike him as a giveaway.

The robot was brought into the course to relieve teachers of the huge flow of questions that students ask while studying. Unlike ordinary Internet chatbots, the Jill robot is capable of learning.

Strictly speaking, this robot teacher passed the famous Alan Turing test, which for quite a long time was considered the main criterion for answering the question “Can machines think?”

The Turing test is an empirical test whose idea was proposed by Alan Turing in the article “Computing Machinery and Intelligence,” published in 1950 in the philosophical journal Mind. Turing set out to determine whether a machine could think.

The standard interpretation of this test is: “A person interacts with one computer and one person. Based on the answers to the questions, he must determine who he is talking to: a person or a computer program. The purpose of a computer program is to mislead a person into making the wrong choice.”

None of the test participants can see each other. If the judge cannot say for sure which of the interlocutors is human, the machine is considered to have passed the test. To test the machine's intelligence rather than its ability to recognize spoken language, the conversation is conducted in “text only” mode, for example using a keyboard and a screen (an intermediary computer). Messages should be exchanged at controlled intervals so that the judge cannot draw conclusions from the speed of the responses. In Turing's time, computers responded more slowly than humans. Today the rule is still necessary, because they now respond much faster than humans.
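The "controlled intervals" rule can be illustrated with a small sketch. It assumes a hypothetical relay that holds every reply, whether from the person or the machine, until the next fixed time slot, so that response speed reveals nothing. The function name `release_time` and the 30-second slot length are illustrative assumptions, not part of Turing's description.

```python
import math


def release_time(received_at: float, interval: float = 30.0) -> float:
    """Return the first slot boundary strictly after a reply was received.

    Slots fall at 0, interval, 2 * interval, ... seconds from the start of
    the session (an assumed scheme, chosen only for illustration).
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    # Count how many whole slots have already elapsed, then move to the next.
    return (math.floor(received_at / interval) + 1) * interval


# A reply typed in 2 seconds and one typed in 29 seconds reach the judge together.
print(release_time(2.0), release_time(29.0))
```

With this scheme a fast machine and a slow typist look identical to the judge as long as both answer within the same slot.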

Alan Turing was a famous English mathematician and cryptographer who, during the Second World War, developed an algorithm for breaking the German Enigma cipher. He begins his article with the statement: "I propose to consider the question 'Can machines think?'." Turing points out that the traditional approach to this question is to first define the concepts of “machine” and “intelligence.” As if realizing that this could be discussed endlessly to little purpose, Turing chooses a different path. He suggests replacing the question “Do machines think?” with the question “Can machines do what we (as thinking creatures) can do?”

In the final version of the Turing Test, the jury must ask questions to a computer whose task is to make the jury members believe that it is actually human.

Over time, a heated debate erupted among cognitive scientists around the Turing test. For example, in 1980 the American philosopher John Rogers Searle wrote the article “Minds, Brains, and Programs,” in which he put forward a counterargument known as the “Chinese room” thought experiment. Searle insisted that even if robots or programs passed the Turing test, this would only mean that they were manipulating symbols they did not understand. And without understanding there is no mind. So, he argued, the Turing test is wrong.

The Chinese Room experiment involves placing the subject in an isolated room, into which questions written in Chinese characters are passed through a narrow slot. With the help of a book of instructions for manipulating the characters, a person who does not understand Chinese writing at all will be able to answer every question correctly and mislead the person asking them. The questioner will assume that whoever is answering knows Chinese very well.

During the discussion, which lasted throughout the 1980s and 1990s, people even recalled “Leibniz's mill”, a thought experiment that the great mathematician described in his book “Monadology”. Leibniz proposes imagining a machine the size of a mill that could simulate feelings, thoughts and perceptions. From the outside, that is, it would seem intelligent. But if you went inside such a machine, you would find that none of its mechanisms is a consciousness or a brain. It seems that Leibniz and Searle expressed the same idea in different ways: even if a machine seems to think, it does not actually think.

There is still no answer to the question “Can machines think?”, for one simple reason: scientists have stopped arguing and are trying to create such machines. Perhaps they will succeed someday. However, it is also possible that artificial intelligence will deceive even its creators, who will believe in an intelligence that is in fact only manipulation, but manipulation so skillful that no person will be able to expose it.

The film by the outstanding Soviet documentary director Semyon Raitburt shows one attempt by a robot to pass the Turing test. In the experiment reproduced in the film, several people put the same questions to two unknown interlocutors, trying to work out whether they are dealing with a machine or a person. I admit that I personally got it wrong; the robot turned out not to be the one I thought it was. So I completely understand the feelings of the students of “Miss Jill Watson”, who for months mistook her for a person.

Challenge yourself, comrades!

The phrase "Turing test" most properly refers to a proposal that addresses the question of whether machines can think. According to Turing, that question as it stands is “too meaningless” to deserve discussion. However, if we consider the more specific question of whether a digital computer can do well in a certain kind of imitation game, then a precise discussion becomes possible. Moreover, Turing himself believed that before too long there would be computing devices that were very “good” at this game.

The expression "Turing test" is sometimes used more generally to refer to certain behavioral studies of the presence of mind, thought, or intelligence in supposedly intelligent subjects. For example, the opinion is sometimes expressed that the prototype of the test is described in Descartes’ Discourse on Method.

Who invented the Turing test?

In 1950, the paper “Computing Machinery and Intelligence” was published, in which the idea of an imitation game was first proposed. The person who came up with the Turing test is the English computer scientist, mathematician, logician, cryptanalyst and theoretical biologist Alan Mathison Turing. His models allowed the concepts of algorithm and computation to be formalized, and they contributed to theories of artificial intelligence.

The Imitation Game

Turing describes the following kind of game. Suppose there is a person, a machine, and a person asking questions. The interviewer is in a room separated from the other two participants. The purpose of the game is for the interviewer to determine which of them is the person and which is the machine. The interviewer knows the two subjects under the labels X and Y, but at least at the beginning he does not know who is hiding behind which label. At the end of the game, he must say either that X is the person and Y is the machine, or vice versa. The interviewer is allowed to ask the subjects questions of the following type: “Well, would X be kind enough to tell me whether X plays chess?” Whoever is X must answer questions addressed to X. The machine's aim is to mislead the interviewer into concluding that it is the person. The person's aim is to help establish the truth. About this game Alan Turing wrote in 1950: “I believe that in about fifty years' time it will be possible to programme computers, with a storage capacity of about 10^9, to make them play the imitation game so well that an average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning.”

Empirical and conceptual aspects

There are at least two kinds of questions that arise about Turing's prediction. The first is empirical: is it true that there are, or soon will be, computers capable of playing the imitation game so successfully that the average interviewer has no more than a 70% chance of making the right choice within five minutes? The second is conceptual: is it true that if the average interviewer, after five minutes of questioning, had no more than a 70% chance of correctly telling the person from the machine, then we should conclude that the machine exhibits some level of thought, intelligence or mind?
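The empirical half of the question reduces to a simple proportion check. The sketch below assumes we have a tally of how many five-minute sessions ended with a correct identification. The function name and the sample counts are hypothetical, and a real evaluation would also need enough trials for the average to be meaningful.

```python
def within_turing_prediction(correct: int, total: int, threshold: float = 0.70) -> bool:
    """Check whether interviewers were right no more than `threshold` of the time.

    `correct` counts sessions in which the interviewer identified the machine
    correctly; `total` counts all sessions. Both are assumed inputs.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= correct <= total:
        raise ValueError("correct must lie between 0 and total")
    return correct / total <= threshold


# Hypothetical tallies: 69 of 100 correct meets the prediction, 85 of 100 does not.
print(within_turing_prediction(69, 100), within_turing_prediction(85, 100))
```

Meeting the numerical bar settles only the empirical question; whether that would show thinking is the separate conceptual question raised above.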

Loebner Competition

There is little doubt that Alan Turing would have been disappointed with the state of the imitation game at the end of the 20th century. Entrants in the Loebner Competition (an annual event in which computer programs are subjected to the Turing Test) fall far short of the standard envisioned by the founder of computer science. A quick look at the transcripts from recent decades shows that the machine can easily be detected with not very sophisticated questions. Moreover, leading figures in the field regularly say that the Loebner Competition is an embarrassment, precisely because there is still no computer program that can carry on a decent conversation for five minutes. It is generally accepted that the competing programs are developed solely to win the small prize awarded to the best entrant of the year, and are not designed for anything more.

Turing Test: Does it take too long to pass?

By the middle of the second decade of the 21st century, the situation had hardly changed. True, in 2014 claims arose that the computer program Eugene Goostman had passed the Turing test when it fooled 33% of the judges in a competition that year. But other one-off competitions have produced similar results. Back in 1991, PC Therapist misled 50% of the judges, and in a 2011 demonstration Cleverbot had an even higher success rate. In all three cases the sessions were very short and the results were not reliable. None of them provided strong evidence that the average interviewer had no more than a 70% chance of correctly identifying the respondent in a five-minute session.

Method and forecast

Moreover, and much more importantly, it is necessary to distinguish between the Turing test itself and the prediction Turing made about its being passed by the end of the twentieth century. The probability of correct identification, the length of time over which the test takes place, and the number of questions required are adjustable parameters of the test, even though the prediction fixes them at specific values. Even if the founder of computer science was very far from the truth in his prediction about the state of artificial intelligence by the end of the twentieth century, the method he proposed may well be sound. But before endorsing the Turing Test, there are various objections that need to be addressed.

Is it necessary to be able to speak?

Some people consider the Turing Test chauvinistic in the sense that it recognizes intelligence only in things that are capable of holding a conversation with us. Why can't there be intelligent things that are incapable of having a conversation, or at least a conversation with people? Perhaps the thought behind this question is correct. On the other hand, we can assume that for any two intelligent agents speaking different languages there are qualified translators who would allow them to carry on any conversation. But in any case, the accusation of chauvinism is beside the point. Turing is saying only that if something can have a conversation with us, then we have good reason to believe that it has a mind similar to ours. He does not say that the ability to have a conversation with us is the only evidence of potentially having a mind like ours.

Why is it so easy?

Others consider the Turing test not demanding enough. There is anecdotal evidence that completely stupid programs (like ELIZA) can appear intelligent to the average observer for quite some time. Moreover, in a time as short as five minutes, it is likely that almost all interviewers could be fooled by clever but completely unintelligent applications. However, it is important to remember that a program cannot pass the Turing test by fooling “mere observers” under conditions other than those under which the test is meant to take place. The application must be able to withstand interrogation by someone who knows that one of the other two participants in the conversation is a machine. In addition, the program must withstand such interrogation with a high degree of success over multiple trials. Turing does not say exactly how many trials are required. However, we can safely assume that their number must be large enough for it to make sense to speak of an average.

If a program is capable of this, then it seems plausible to say that we would have at least tentative reason to assume the presence of intelligence. Perhaps it is worth emphasizing once again that there could be an intelligent subject, including a smart computer, that fails the Turing test. It is possible, for example, to imagine machines that refuse to lie for moral reasons. Since the human participant is expected to do everything possible to help the interviewer, the question "Are you a machine?" would quickly distinguish such pathologically truthful subjects from people.

Why is it so difficult?

There are those who doubt that a machine will ever be able to pass the Turing test. Among the arguments they put forward are the difference in how quickly people recognize words in their native language and in a foreign one, the ability to rank neologisms and categories, and other features of human perception that are difficult to simulate but are not essential for intelligence.

Why a discrete machine?

Another controversial aspect of how the Turing Test works is that its discussion is limited to "digital computers". On the one hand, it is obvious that this is important only for the forecast, and does not concern the details of the method itself. Indeed, if the test is reliable, then it will be suitable for any entity, including animals, aliens and analog computing devices. On the other hand, it is highly controversial to say that “thinking machines” must be digital computers. It is also doubtful that Turing himself believed so. In particular, it is worth noting that the seventh objection he considers concerns the possibility of the existence of continuous state machines, which the author recognizes as different from discrete ones. Turing argued that even if we were continuous state machines, a discrete machine could imitate us well in the imitation game. However, it seems doubtful that his considerations are sufficient to establish that, given continuous state machines that pass the test, it is possible to make a discrete state machine that also passes the test.

In general, the important point seems to be that although Turing recognized the existence of a much larger class of machines beyond discrete state machines, he was confident that a properly designed discrete state machine could succeed in the imitation game.

The Turing test is an empirical experiment in which a person communicates with an intelligent computer program that imitates human responses.

The Turing test is considered passed if a person communicating with a machine believes that he is communicating with a person and not a machine.

In 1950 the British mathematician Alan Turing devised this experiment by analogy with an imitation game in which two people go into different rooms and a third person must work out who is where by communicating with them in writing.

Turing proposed playing such a game with a machine, and if the machine could mislead an expert, this would mean that the machine could think. Thus, the classic test follows the following scenario:

A human expert communicates via chat with a chatbot and other people. At the end of the conversation, the expert must understand which of the interlocutors was human and which was a bot.

Nowadays the Turing test exists in many different modifications. Let's consider some of them:

Reverse Turing test

The test consists of performing some action to confirm that you are a person. For example, we are often asked to look at a distorted image of numbers and letters and type them into a special field. This protects the site from bots. Passing this test would confirm a machine's ability to perceive complex distorted images, but no such machines exist yet.

Immortality test

The test consists of reproducing a person's personal characteristics as closely as possible. It is believed that if a person's character is copied so accurately that it cannot be distinguished from the original, the immortality test has been passed.

Minimum Intelligent Signal Test

The test assumes a simplified form of answering questions - only yes and no.
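Because every answer in this variant is a plain yes or no, scoring can be reduced to counting matches. The sketch below is only an illustration under that assumption: the function name, the sample questions and the "expected" human answers are hypothetical, and the source does not describe an actual scoring procedure.

```python
def mist_score(answers: list[bool], expected: list[bool]) -> float:
    """Return the fraction of yes/no answers that match the expected ones.

    True stands for "yes" and False for "no". The expected answers are
    assumed to come from a human reference panel (a hypothetical setup).
    """
    if not expected or len(answers) != len(expected):
        raise ValueError("answers and expected must be non-empty and equal in length")
    matches = sum(a == e for a, e in zip(answers, expected))
    return matches / len(expected)


# Hypothetical questions: "Is water wet?", "Is the Sun cold?",
# "Can fish climb trees?", "Is a week longer than a day?"
expected = [True, False, False, True]
subject = [True, False, True, True]
print(mist_score(subject, expected))  # 0.75
```

Restricting the answers to two symbols removes style and fluency from the picture, so only the content of the answers is compared.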

Meta Turing Test

The test assumes that a machine “can think” if it can create something that it itself wants to test for intelligence.

The first passing of the classical Turing test was recorded on June 7, 2014 by the chatbot “Eugene Goostman”, developed in St. Petersburg. The bot convinced experts that they were communicating with a 13-year-old teenager from Odessa.

In general, machines are already capable of a great deal. Many specialists are now working in this direction, and more and more interesting variations of the test, and attempts to pass it, lie ahead.

"Eugene Goostman" managed to pass the Turing test and convince 33% of the judges that they were not communicating with a machine. The program posed as a thirteen-year-old boy named Eugene Goostman from Odessa and was able to convince the people talking to it that its answers came from a person.

The test took place at the Royal Society in London and was organized by the University of Reading, UK. The authors of the program are the Russian engineer Vladimir Veselov, who currently lives in the United States, and the Ukrainian Evgeniy Demchenko, who now lives in Russia.

How did the program "Eugene Goostman" pass the Turing test?

On Saturday, June 7, 2014, a program named Eugene attempted to reproduce the intelligence of a thirteen-year-old boy, Eugene Goostman.

Five programs took part in the testing, which was organized by the School of Systems Engineering at the University of Reading (UK). The test consisted of a series of five-minute written dialogues.

The developers managed to prepare the bot for all kinds of questions and even trained it to collect sample dialogues via Twitter. In addition, the engineers gave their hero a vivid personality. Posing as a 13-year-old boy, the virtual “Eugene Goostman” did not arouse the experts' suspicions. They accepted that the boy might not know the answers to many questions, because the average child knows significantly less than an adult. At the same time, his correct and accurate answers were put down to unusual erudition.

The test involved 25 “hidden” people and 5 chatbots. Each of the 30 judges conducted five chat sessions, trying to determine the true nature of the interlocutor. For comparison, the traditional annual competition of artificial intelligence programs for the Loebner Prize involves only 4 programs and 4 hidden people.
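The scale of the event follows from the figures quoted in the text by simple arithmetic. The sketch below only multiplies those figures out: it assumes every judge completed all five sessions and that the reported 33% refers to the 30 judges. Neither assumption is confirmed by the text, and the function name is hypothetical.

```python
def session_totals(judges: int, sessions_per_judge: int, fooled_share: float):
    """Return (total chat sessions, approximate number of judges fooled)."""
    total_sessions = judges * sessions_per_judge
    # The reported share is rounded, so round to the nearest whole judge.
    fooled_judges = round(judges * fooled_share)
    return total_sessions, fooled_judges


print(session_totals(30, 5, 0.33))  # (150, 10)
```

Under these assumptions the headline figure of 33% corresponds to roughly ten judges, which shows how small the sample behind the claim is.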

The first version of the program with the “young Odessa resident” appeared back in 2001. However, only in 2012 did it show a truly serious result, convincing 29% of the judges.

This suggests that in the near future programs will appear that can pass the Turing test without difficulty.

And yesterday I failed the Turing test: I was mistaken for a computer! It happened while I was playing chess on freechess.org. Online chess is full of crybabies who, at the slightest pretext, accuse their opponent of using an engine. Of course, many people do cheat this way, but I always get accused without any basis. Sometimes I peek into the opening library, but after that I play entirely on my own. If you manage to catch your opponent in a long variation, for some reason it often makes him terribly butthurt: a person, he says, can't play like that.

You can replay the whole game in the viewer here: Karapuzik vs. chessmasterrossie. I will now comment on the most striking moments separately. The fact is that I really liked the game myself, and I want to brag about it. The time control was 5 minutes per game plus 5 seconds per move.

This is the position that emerged after 18 moves.

In the opening, White (me) played somewhat carelessly. In particular, the queen took the route d1-b3-d1-g1, and a lot of time was lost. The queen often stands on g1 or f2 in this setup, but its path there is usually less tortuous. Black's only loss of time was the knight manoeuvre b8-c6-e5-d7, and now he is clearly preparing b6-b5. The main problem is that my favorite plan of advancing the a-pawn does not work for White: his own knight on a3 is in the way. Until I move it, there is no active plan. And as soon as I move it, I get hit with b5... Then I noticed a combinational motif and staged a provocation: 19.Nc2 b5? 20.Nb4 Qb7.

21.N:a6! Q:a6 (I believe that 21... b4 was stronger) 22.c:b5 B:b5 23.N:b5 R:b5 24.a4

That's the whole point! Now White wins a whole rook and comes out of the skirmish an exchange and a pawn up. What followed was a rather chaotic blitz game, at the end of which my opponent again fell for a simple tactic. This seems to have finished him off. After all, only computers can do tactics, especially ones as complex as this:

34... B:b4? 35.Rb1 Rb7 (that was all I hoped for, but...) 36.a6! Rb5 37.a7, and to stop the pawn, Black has to give up the bishop on b4.

Then my opponent began to move slowly. I look, and he is writing to me. Here is what he wrote:

chessmasterrossie says: good engine usage
chessmasterrossie says: good engine usagenh5
chessmasterrossie says: such comput er moves
chessmasterrossie says: such computer moves
chessmasterrossie says: qg1???
chessmasterrossie says: as if a human would play that
chessmasterrossie says: g4?
chessmasterrossie says: such a human move!
chessmasterrossie says: how obviously was that a use of a chess engine.
chessmasterrossie says: I will send a compulaint
chessmasterrossie says: complaint

Just a balm for the heart. =)

Ever since computers appeared, science fiction writers have been inventing stories about intelligent machines that take over the world and enslave people. Scientists laughed at this at first, but as information technology developed, the idea of an intelligent machine no longer seemed so incredible. To test whether a computer can have intelligence, the Turing test was created. It was invented by none other than Alan Turing, after whom the method is named. Let's talk in more detail about what kind of test this is and what it can actually do.

How to pass the Turing test?

We know who invented the Turing test, but why did he do it? Was it to prove that no machine can compare with a person? In fact, Alan Turing was engaged in serious research into “machine intelligence” and assumed that it was possible to create a machine that could carry out mental activity like a human. In any case, back in 1947 he stated that it was not difficult to make a machine that could play chess well, and that if this was possible, then it was also possible to create a “thinking” computer. But how can we determine whether the engineers have achieved their goal, whether their brainchild has intelligence or is just another advanced calculator? For this purpose, Alan Turing created his test, which lets you judge how far machine intelligence can compete with human intelligence.

The essence of the Turing test is this: if a computer can think, then during a conversation a person will not be able to distinguish the machine from another person. The test involves two people and one computer. None of the participants can see each other, and communication takes place in writing. The correspondence is conducted at controlled intervals so that the judge cannot identify the computer by the speed of its responses. The test is considered passed if the judge cannot say whether he is communicating with a person or a computer. No program has yet been able to fully pass the Turing test. In 1966, the Eliza program managed to fool judges, but only because it imitated the techniques of a psychotherapist practicing client-centered therapy, and people were not told that they might be talking to a computer. In 1972, the PARRY program, which simulated a paranoid schizophrenic, also managed to fool psychiatrists 52% of the time. One team of psychiatrists conducted the test, and a second team read the transcripts. Both teams had to work out which words came from real people and which from the program. They succeeded in only 48% of cases, but the Turing test involves live communication, not reading transcripts.
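The Eliza trick of echoing a client's own words back as a question can be shown in a few lines. This is a minimal sketch in the spirit of that approach, not the original Eliza script: the rules, the fallback phrase and the function name `reply` are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical pattern/response pairs, tried in order.
RULES = [
    (re.compile(r"\bI feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (\w+)", re.IGNORECASE), "Tell me more about your {0}."),
]
FALLBACK = "Please go on."


def reply(utterance: str) -> str:
    """Turn a statement into a therapist-style question by pattern matching."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            # Reuse the speaker's own words; no understanding is involved.
            return template.format(match.group(1))
    return FALLBACK


print(reply("I feel tired."))        # Why do you feel tired?
print(reply("My mother ignores me"))  # Tell me more about your mother.
```

The program never models what the words mean, which is why such a dialogue holds up only when the person does not suspect a machine and expects a therapist to answer with questions.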

Today there is the Loebner Prize, which is awarded on the results of an annual competition among programs attempting the Turing test. There are gold (visual and audio), silver (audio) and bronze (text) awards. The first two have not yet been awarded, while the bronze medals have gone to the programs that best imitated a person in correspondence. But such communication cannot be called full-fledged, since it is more like a friendly chat made up of fragmentary phrases. That is why it is too early to speak of the Turing test having been fully passed.

Reverse Turing test

Everyone has encountered one version of the reverse Turing test: the annoying requests from websites to solve a CAPTCHA, which are used to protect against spam bots. It is believed that there are not yet sufficiently powerful programs capable of recognizing distorted text and reproducing it (or that they are not available to the average user). Here is a funny paradox: now we have to prove our ability to think to computers.