
Thread: Artificial Intelligence Learns to Learn Entirely on Its Own

  #1

    Artificial Intelligence Learns to Learn Entirely on Its Own

    https://www.quantamagazine.org/artif...-own-20171018/

    October 18, 2017 by Kevin Hartnett



    A mere 19 months after dethroning the world’s top human Go player, the computer program AlphaGo has smashed an even more momentous barrier: It can now achieve unprecedented levels of mastery purely by teaching itself. Starting with zero knowledge of Go strategy and no training by humans, the new iteration of the program, called AlphaGo Zero, needed just three days to invent advanced strategies undiscovered by human players in the multi-millennia history of the game. By freeing artificial intelligence from a dependence on human knowledge, the breakthrough removes a primary limit on how smart machines can become.

    Earlier versions of AlphaGo were taught to play the game using two methods. In the first, called supervised learning, researchers fed the program 100,000 top amateur Go games and taught it to imitate what it saw. In the second, called reinforcement learning, they had the program play itself and learn from the results.
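
    To make the first of those methods concrete, here is a minimal sketch of imitation learning in Python. It is not DeepMind's code: the position labels, move names, and lookup table are invented for illustration, and a real system learns a neural network over board states. But it shows the ceiling the approach implies: a pure imitator can only reproduce what its teachers played most often.

        from collections import Counter, defaultdict

        # Toy stand-in for "100,000 top amateur games": each record pairs a
        # position with the move an expert played there. All labels invented.
        records = [
            ("empty_corner", "approach"), ("empty_corner", "approach"),
            ("empty_corner", "pincer"),   ("low_shoulder", "push_up"),
        ]

        def fit(records):
            seen = defaultdict(Counter)
            for position, move in records:
                seen[position][move] += 1
            # Imitation: in every position, prefer whatever the experts
            # played most often. You cannot exceed your teachers this way.
            return {pos: moves.most_common(1)[0][0]
                    for pos, moves in seen.items()}

        policy = fit(records)
        print(policy["empty_corner"])   # -> approach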

    AlphaGo Zero skipped the first step. The program began as a blank slate, knowing only the rules of Go, and played games against itself. At first, it placed stones randomly on the board. Over time it got better at evaluating board positions and identifying advantageous moves. It also learned many of the canonical elements of Go strategy and discovered new strategies all its own. “When you learn to imitate humans the best you can do is learn to imitate humans,” said Satinder Singh, a computer scientist at the University of Michigan who was not involved with the research. “In many complex situations there are new insights you’ll never discover.”

    After three days and 4.9 million games of self-play training, the researchers matched AlphaGo Zero against the earlier, champion-beating version of the program. AlphaGo Zero won 100 games to zero.

    To expert observers, the rout was stunning. Pure reinforcement learning would seem to be no match for the overwhelming number of possibilities in Go, which is vastly more complex than chess: You’d have expected AlphaGo Zero to spend forever searching blindly for a decent strategy. Instead, it rapidly found its way to superhuman abilities.

    The efficiency of the learning process owes to a feedback loop. Like its predecessor, AlphaGo Zero determines what move to play through a process called a “tree search.” The program starts with the current board and considers the possible moves. It then considers what moves its opponent could play in each of the resulting boards, and then the moves it could play in response and so on, creating a branching tree diagram that simulates different combinations of play resulting in different board setups.



    AlphaGo Zero can’t follow every branch of the tree all the way through, since that would require inordinate computing power. Instead, it selectively prunes branches by deciding which paths seem most promising. It makes that calculation — of which paths to prune — based on what it has learned in earlier play about the moves and overall board setups that lead to wins.
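
    A toy version of that idea can be sketched in a few lines of Python, with Nim (players alternately take one to three counters; whoever takes the last counter wins) standing in for Go. Everything below is an illustrative assumption rather than DeepMind's method: the estimate function is a crude stand-in for the learned evaluator, and the search keeps only the few branches that evaluator rates best instead of following them all.

        def legal_moves(counters):
            return [m for m in (1, 2, 3) if m <= counters]

        def estimate(counters):
            # Stand-in for the learned evaluator: estimated chance that the
            # player to move wins. AlphaGo Zero gets this from a deep neural
            # network; this toy just prefers positions with fewer counters.
            return 0.0 if counters == 0 else 1.0 / counters

        def search(counters, depth=4, branch_limit=2):
            # Value of the position for the player to move. Rather than
            # following every branch, keep only the branch_limit moves the
            # evaluator rates best: the selective pruning described above.
            if counters == 0:
                return 0.0      # the opponent took the last counter; we lost
            if depth == 0:
                return estimate(counters)
            moves = legal_moves(counters)
            # Promising moves leave the opponent in a low-value position.
            moves.sort(key=lambda m: estimate(counters - m))
            return max(1.0 - search(counters - m, depth - 1, branch_limit)
                       for m in moves[:branch_limit])

        for n in range(1, 8):
            print(n, round(search(n), 2))   # multiples of 4 come out as losses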

    Earlier versions of AlphaGo did all this, too. What’s novel about AlphaGo Zero is that instead of just running the tree search and making a move, it remembers the outcome of the tree search — and eventually of the game. It then uses that information to update its estimates of promising moves and the probability of winning from different positions. As a result, the next time it runs the tree search it can use its improved estimates, trained with the results of previous tree searches, to generate even better estimates of the best possible move.
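
    Here is a rough but runnable sketch of that feedback loop, again on toy Nim. A plain lookup table stands in for AlphaGo Zero's neural network, a greedy one-step lookahead stands in for the full tree search, and the update rule (nudging stored estimates toward actual game outcomes) is an illustrative simplification of what the paper describes.

        import random

        value = {}                           # position -> learned win estimate

        def estimate(counters):
            return value.get(counters, 0.5)  # unseen positions start at 50/50

        def choose_move(counters, explore=0.1):
            moves = [m for m in (1, 2, 3) if m <= counters]
            if random.random() < explore:    # occasionally try other branches
                return random.choice(moves)
            # Stand-in for the tree search: leave the opponent in the
            # position our current estimates rate worst for them.
            return min(moves, key=lambda m: estimate(counters - m))

        def self_play(counters):
            history, player = [], 0
            while counters > 0:
                history.append((counters, player))
                counters -= choose_move(counters)
                player = 1 - player
            return history, 1 - player       # whoever took the last counter won

        def train(games=20000, lr=0.1):
            for _ in range(games):
                history, winner = self_play(random.randint(1, 12))
                for counters, player in history:
                    outcome = 1.0 if player == winner else 0.0
                    # The feedback loop: nudge each estimate toward the real
                    # result, so the next search starts from better numbers.
                    value[counters] = estimate(counters) + lr * (outcome - estimate(counters))

        train()
        print({n: round(estimate(n), 2) for n in range(1, 13)})
        # Losing positions (multiples of 4) drift toward 0, the rest toward 1.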

    The computational strategy that underlies AlphaGo Zero is effective primarily in situations in which you have an extremely large number of possibilities and want to find the optimal one. In the Nature paper describing the research, the authors of AlphaGo Zero suggest that their system could be useful in materials exploration — where you want to identify atomic combinations that yield materials with different properties — and protein folding, where you want to understand how a protein’s precise three-dimensional structure determines its function.

    As for Go, the effects of AlphaGo Zero are likely to be seismic. To date, gaming companies have failed in their efforts to develop world-class Go software. AlphaGo Zero is likely to change that. Andrew Jackson, executive vice president of the American Go Association, thinks it won’t be long before Go apps appear on the market. This will change the way human Go players train. It will also make cheating easier.

    As for AlphaGo, the future is wide open. Go is sufficiently complex that there’s no telling how good a self-starting computer program can get; and AlphaGo now has a learning method to match the expansiveness of the game it was bred to play.



  #2
    Artificial intelligence is so much BS.

    Notice that you never see an article or quote from the guy that writes code. Why? Because that would burst the BS bubble.
    1. Don't lie.
    2. Don't cheat.
    3. Don't steal.
    4. Don't kill.
    5. Don't commit adultery.
    6. Don't covet what your neighbor has, especially his wife.
    7. Honor your father and mother.
    8. Remember the Sabbath and keep it Holy.
    9. Don’t use your Higher Power's name in vain, or anyone else's.
    10. Do unto others as you would have them do to you.

    "For the love of money is the root of all evil..." -- I Timothy 6:10, KJV

  #3
    Quote Originally Posted by Jamesiv1
    Artificial intelligence is so much BS.

    Notice that you never see an article or quote from the guy that writes code. Why? Because that would burst the BS bubble.
    People who write code generally do not speak too much.

  #4
    Skynet is online!?!?

    I hope the robots look like Number Six from BSG.
    It's all about taking action and not being lazy. So you do the work, whether it's fitness or whatever. It's about getting up, motivating yourself and just doing it.
    - Kim Kardashian

    Donald Trump / Crenshaw 2024!!!!

    My pronouns are he/him/his

  #5
    The paperclip maximizer is the canonical thought experiment showing how an artificial general intelligence, even one designed competently and without malice, could ultimately destroy humanity. The thought experiment shows that AIs with apparently innocuous values could pose an existential threat.
    https://wiki.lesswrong.com/wiki/Paperclip_maximizer
    "I am a bird"

  #6
    Quote Originally Posted by Jamesiv1
    Artificial intelligence is so much BS.

    Notice that you never see an article or quote from the guy that writes code. Why? Because that would burst the BS bubble.
    Damn, so you mean they got a guy who was secretly the best Go player to code an "AI" (actually just a bot) to beat the guy who was officially the best Go player, just so they could pump up the prowess of AI? Nah, that's stupid; there's no way you could have meant that.

  #7
    Quote Originally Posted by timosman
    People who write code generally do not speak too much.
    They also don't speak English very well.
    It's all about taking action and not being lazy. So you do the work, whether it's fitness or whatever. It's about getting up, motivating yourself and just doing it.
    - Kim Kardashian

    Donald Trump / Crenshaw 2024!!!!

    My pronouns are he/him/his

  #8

    The AI That Has Nothing to Learn From Humans

    https://www.theatlantic.com/technolo...elf-go/543450/

    October 20, 2017 by Dawn Chan

    It was a tense summer day in 1835 Japan. The country’s reigning Go player, Honinbo Jowa, took his seat across a board from a 25-year-old prodigy by the name of Akaboshi Intetsu. Both men had spent their lives mastering the two-player strategy game that’s long been popular in East Asia. Their face-off, that day, was high-stakes: Honinbo and Akaboshi represented two Go houses fighting for power, and the rivalry between the two camps had lately exploded into accusations of foul play.

    Little did they know that the match—now remembered by Go historians as the “blood-vomiting game”—would last for several grueling days. Or that it would lead to a grisly end.

    Early on, the young Akaboshi took a lead. But then, according to lore, “ghosts” appeared and showed Honinbo three crucial moves. His comeback was so overwhelming that, as the story goes, his junior opponent keeled over and began coughing up blood. Weeks later, Akaboshi was found dead. Historians have speculated that he might have had an undiagnosed respiratory disease.

    It makes a certain kind of sense that the game’s connoisseurs might have wondered if they’d seen glimpses of the occult in those three so-called ghost moves. Unlike something like tic-tac-toe, which is straightforward enough that the optimal strategy is always clear-cut, Go is so complex that new, unfamiliar strategies can feel astonishing, revolutionary, or even uncanny.

    Unfortunately for ghosts, now it’s computers that are revealing these goosebump-inducing moves.

    As many will remember, AlphaGo—a program that used machine learning to master Go—decimated world champion Ke Jie earlier this year. Then, the program’s creators at Google’s DeepMind let the program continue to train by playing millions of games against itself. In a paper published in Nature earlier this week, DeepMind revealed that a new version of AlphaGo (which they christened AlphaGo Zero) picked up Go from scratch, without studying any human games at all. AlphaGo Zero took a mere three days to reach the point where it was pitted against an older version of itself and won 100 games to zero.

    Now that AlphaGo’s arguably got nothing left to learn from humans—now that its continued progress takes the form of endless training games against itself—what do its tactics look like, in the eyes of experienced human players? We might have some early glimpses into an answer.

    AlphaGo Zero’s latest games haven’t been disclosed yet. But several months ago, the company publicly released 55 games that an older version of AlphaGo played against itself. (Note that this is the incarnation of AlphaGo that had already made quick work of the world’s champions.) DeepMind called its offering a “special gift to fans of Go around the world.”

    Since May, experts have been painstakingly analyzing the 55 machine-versus-machine games. And their descriptions of AlphaGo’s moves often seem to keep circling back to the same several words: Amazing. Strange. Alien.

    “They’re how I imagine games from far in the future,” Shi Yue, a top Go player from China, has told the press. A Go enthusiast named Jonathan Hop who’s been reviewing the games on YouTube calls the AlphaGo-versus-AlphaGo face-offs “Go from an alternate dimension.” From all accounts, one gets the sense that an alien civilization has dropped a cryptic guidebook in our midst: a manual that’s brilliant—or at least, the parts of it we can understand.

    Will Lockhart, a physics grad student and avid Go player who codirected The Surrounding Game (a documentary about the pastime’s history and devotees) tried to describe the difference between watching AlphaGo’s games against top human players, on the one hand, and its self-paired games, on the other. (I interviewed Will’s Go-playing brother Ben about Asia’s intensive Go schools in 2016.) According to Will, AlphaGo’s moves against Ke Jie made it seem to be “inevitably marching toward victory,” while Ke seemed to be “punching a brick wall.” Any time the Chinese player had perhaps found a way forward, said Lockhart, “10 moves later AlphaGo had resolved it in such a simple way, and it was like, ‘Poof, well that didn’t lead anywhere!’”

    By contrast, AlphaGo’s self-paired games might have seemed more frenetic. More complex. Lockhart compares them to “people sword-fighting on a tightrope.”

    Expert players are also noticing AlphaGo’s idiosyncrasies. Lockhart and others mention that it almost fights various battles simultaneously, adopting an approach that might seem a bit madcap to human players, who’d probably spend more energy focusing on smaller areas of the board at a time. According to Michael Redmond, the highest-ranked Go player from the Western world (he relocated to Japan at the age of 14 to study Go), humans have accumulated knowledge that might tend to be more useful on the sides and corners of the board. AlphaGo “has less of that bias,” he noted, “so it can make impressive moves in the center that are harder for us to grasp.”

    Also, it’s been making unorthodox opening moves. Some of those gambits, just two years ago, might have seemed ill-conceived to experts. But now pro players are copying certain of these unfamiliar tactics in tournaments, even if no one fully understands how they lead to victory. For example, people have noticed that some versions of AlphaGo seem to like playing what’s called a three-three invasion on a star point, and they’re experimenting with that move in tournaments now too. No one’s seeing these experiments lead to clearly consistent victories yet, maybe because human players don’t understand how best to follow through.

    Some moves AlphaGo likes to make against its clone are downright incomprehensible, even to the world’s best players. (These tend to happen early on in the games—probably because that phase is already mysterious, being farthest away from any final game outcome.) One opening move in Game One has many players stumped. Says Redmond, “I think a natural reaction (and the reaction I’m mostly seeing) is that they just sort of give up, and sort of throw their hands up in the opening. Because it’s so hard to try to attach a story about what AlphaGo is doing. You have to be ready to deny a lot of the things that we’ve believed and that have worked for us.”

    Like others, Redmond notes that the games somehow feel “alien.” “There’s some inhuman element in the way AlphaGo plays,” he says, “which makes it very difficult for us to just even sort of get into the game.”

    Still, Redmond thinks there are moments when AlphaGo (at least its older version) might not necessarily be enigmatically, transcendently good. Moments when it might possibly be making mistakes, even. There are patterns of play called joseki—series of locally confined attacks and responses, in which players essentially battle to a standstill until it only makes sense for them to move to another part of the board. Some of these joseki have been analyzed and memorized and honed over generations. Redmond suspects that people may still be better at responding in a few of these patterns, because people have analyzed them so intensely. (It’s hard to tell though, because in the AlphaGo-versus-AlphaGo games, both “copies” of the program seem to avoid getting into these joseki in the first place.)

    It’s not far-fetched that AlphaGo may still be choosing suboptimal moves—making “mistakes,” if you will. You can see Go as a massive tree made of thousands of branches representing possible moves and countermoves. Over generations, Go players have identified certain clusters of branches that seem to work really well. And now that AlphaGo’s come along, it’s finding even better options. Still, huge swaths of the tree might yet be unexplored. As Lockhart put it, “It could be possible that a perfect God plays [AlphaGo] and crushes it. Or maybe not. Maybe it’s already there. We don't know.”

    * * *

    From his home base in Chiba, Japan, Redmond says, he has been studying AlphaGo’s self-paired games more or less nonstop for the past four months. He’s been videotaping his commentaries on each game and putting out one video per week on the American Go Association’s YouTube channel. One of his biggest challenges in these videos, he says, is to “attach stories” to AlphaGo’s moves.

    “Generally the way humans learn Go is that we have a story,” he points out. “That’s the way we communicate. It’s a very human thing.”

    After all, people can identify and discuss shapes and patterns. Or we can argue with each other about the reasons a killer move won the game. Take a basic example: When teaching beginners, a Go instructor might point out an odd-looking formation of stones resembling a lion’s mouth or a tortoiseshell (among other patterns) and discuss how best to play in these situations. In theory, AlphaGo could have something akin to that knowledge: A portion of its neural network might hypothetically be “sounding an alarm,” so to speak, whenever that lion’s-mouth pattern appears on the board. But even if that were the case, AlphaGo isn’t equipped to turn this sort of knowledge into any kind of a shareable story. So far, that task is one that still falls to people.






