Screamed when I saw this


tfsbaeta

Recommended Posts

Whoa there, calm down. I'll try to at least explain this before a storm of hype builds before Ame even gets a post out. What Ame is trying to say is that the majority of the story and graphics are done for the episode (i.e., it would technically be playable up until the point where the episode ends), but she's probably missing some side content or has to recheck all the dialogue. This DOES NOT mean the episode is close to being ready.

 

You see the little updates section? There's some pretty vital coding that Essentials lacked, which Reborn has slowly been implementing, and Marcello said that he was getting to work on making the AI actually play well. That's an amount of work that takes a long time, since an AI has to be programmed to somewhat think like a human (one who had been dropped as a baby many times). I can't even give you an estimate, since there are a lot of factors.

 

Does that mean the episode is still going to be waiting for a long time? Yes and no. There are two phases for Reborn before a public release: the alpha stage and then the beta stage. The alpha stage traditionally used to be a small group of people Ame knew pretty well, who ran through the game searching for bugs and breaking stuff. Getting all those reports filed and fixed takes two weeks at the very least. The next stage is the beta, in which Ace Members such as myself get to play through the game with the intention of finding any missed errors (alpha focuses on playability, while beta is for polishing). This usually takes a month, due to sometimes receiving multiple releases. So that means Reborn is going to come out in a month and a half, right? WRONG!

 

If I were in Ame's shoes right now, I'd probably try to wrap up everything which needs to be done for this episode and prepare for alpha testing in a month, to give Marcello more time. She has other projects, so she could work on those in the meantime. Then, when a month passes, the alpha phase would begin, and instead of two weeks it would last a month for more time on the AI. Once that period passes, the beta would be released and last a month as well before moving into a public release, so that Marcello has an extra three months to try and update the AI. That also gives a lot of opportunities to polish the gen VII content and lets players play through the whole game to give feedback. You see, this is a very good opportunity to get the fan-game ready for the final product: as it nears the end, getting the final goals in now would help improve the game during the last episode. It's also why this episode is taking significantly longer than previous ones.

 

So, why am I telling you this? You see, when people hype stuff up, it creates the illusion that something is being released really soon when it really is not. For lack of a better word, it creates an adrenaline effect, where people are so energized and have nothing to spend that energy on. This creates a storm of impatience in which people question why something is taking so long when they were "promised" it would be released soon. Thus begins a flood of "when will E17 be released?", which puts a lot of pressure on the devs and could lead to rushing if they cave to that pressure. A rushed game often leads to a huge disappointment as well.

Link to comment
Share on other sites

I'd try not to get too hyped about it. This advice is also coming from someone who got way too hyped for Insurgence's release. Trust me: you don't want to put stress on a group of developers who have already worked really hard to make the game as great as it can possibly be. Still, it's exciting to hear that things are going well so far!

Link to comment
Share on other sites

Hold your horses. I see 17% to go, and a lot of it is A.I. work, which can backfire quite easily. So for all we know, that could take a year. After that, extensive testing needs to be done, because of gen 7 and the new A.I., which might mean Ame has to rebalance the game or the A.I. has to be modified. So be calm; the episode will be out when it's ready.

Link to comment
Share on other sites

Mmm, it's probably not a good idea to get too hyped just yet. As other people have said, there's still a bunch of non-story stuff to finish, and I dunno how long the AI stuff in particular could take, but it could be a while. ^^;

 

Basically, the episode will be ready when it's ready, so please watch warmly~ ...or something like that?

Link to comment
Share on other sites

I'm already hyped, I will always be hyped, and it can take a few more months, I don't care, as long as I will be able to play it. I'll do reruns as I have been doing for the past year. I just freakin' love this game <3 Don't be pressured, oh mighty ones, I'm just a fervent servant and follower of the Church of Reborn

Edited by tfsbaeta
Link to comment
Share on other sites

Mmmm... people seem to be wondering about AI developments and how long those take. As a disclaimer: I have no idea what the Pokemon Reborn AI looks like. In fact, I have no idea what the AI actually looks like in any Pokemon game (I only play competitively and Reborn). But since I know some things about what it could look like at its best, I'll throw in my two cents on evaluating how much work an AI overhaul for gen 7 would take. Note that throughout this entire post I assume the AI is allowed to "cheat" by looking at your team. It knows what moves you have, what your unrevealed Pokemon are, etc. If it's not allowed to do that... well, then this gets a lot more complicated and we have to start employing Bayesian models and priors for probability distributions on such things (in other words, it'd take forever to build our new AI :().

 

Titania's pseudo-code algorithm for Pokemon AI:

Spoiler

(double[] mixedStrategy, double expectedUtility) pokemonAI(Gamestate g)
{
    if (g.isWon())
        return (null, 1.0);
    else if (g.isLost())
        return (null, 0.0);

    double[][] payoffMatrix = new double[myLegalMoves.length][oppLegalMoves.length];
    for (int myMove = 0; myMove < myLegalMoves.length; ++myMove)
        for (int oppMove = 0; oppMove < oppLegalMoves.length; ++oppMove)
        {
            (Gamestate[] outcomes, double[] probabilities) = g.makeMoves(myLegalMoves[myMove], oppLegalMoves[oppMove]);
            for (int k = 0; k < outcomes.length; ++k)
                if (transpositionTable.map(outcomes[k]) != null)
                    // already evaluated: reuse the cached utility
                    payoffMatrix[myMove][oppMove] += probabilities[k] * transpositionTable.map(outcomes[k]);
                else
                    // unseen position: recurse
                    payoffMatrix[myMove][oppMove] += probabilities[k] * pokemonAI(outcomes[k]).expectedUtility;
        }

    (double[] myNashEquilibriumProbabilityDistribution, double[] oppNashEquilibriumProbabilityDistribution) = simplexAlgorithm(payoffMatrix);
    double utility = 0.0;
    for (int myMove = 0; myMove < myLegalMoves.length; ++myMove)
        for (int oppMove = 0; oppMove < oppLegalMoves.length; ++oppMove)
            utility += payoffMatrix[myMove][oppMove] * myNashEquilibriumProbabilityDistribution[myMove] * oppNashEquilibriumProbabilityDistribution[oppMove];
    transpositionTable.addMapping(g, utility);
    return (myNashEquilibriumProbabilityDistribution, utility);
}

 

An English layperson's explanation of the algorithm:

Spoiler

The function returns the ordered pair (best mixed strategy, probability of victory from this position). It's a recursive algorithm that just envisions how the game moves forward as both players make the different combinations of moves. The first block checks for the base case and handles it appropriately (i.e., it checks whether we're thinking about a game that's already over, and if so assigns it a win probability of 0 or 1).

 

The second block recursively builds the matrix of expected payoffs for every move pairing in the Cartesian product of (AI's legal moves) X (player's legal moves). Note that each invocation of the game engine to apply a move pairing to a gamestate actually produces SEVERAL new gamestates, because Pokemon is not deterministic. So it will, for instance, spawn 4 gamestates when you use a Flamethrower (no luck, crit, burn, crit + burn)... or 16 gamestates if both players use Flamethrower, because of the Cartesian product of potential outcomes... except actually fewer than that, since many of those are likely equivalent or unable to occur. In the simplest possible case, I go first and my Flamethrower scores a KO, for a total of 1 outcome. You get the idea. All potential gamestates resulting from the move pairing are generated, the function is recursively applied to the results, and an expected payoff is produced for each move pairing by weighting each resulting utility by the probability it occurs and summing over all potential gamestates.

The only caveat in this process is that we consult a transposition table of gamestates we've already evaluated as we go: no position should be evaluated twice. This prevents infinite loops in the sequence checking, like both players swapping Pokemon back and forth forever, or me using Stealth Rock followed by Defog. It also gives us obvious performance improvements, because we need not evaluate the same position multiple times: if I can go first and kill you with attack 1, I don't need to bother evaluating how play proceeds if I use attack 2, which achieves the same. Or any other, more complex method of arriving back at an already-checked gamestate, which occurs much more often than you might foresee, given that we check the game tree in a depth-first manner and many end-game positions are equivalent.
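To see the memoized game-tree idea in miniature, here's a toy Python sketch of my own (purely illustrative, not Reborn's code): a win-probability search on Nim (players alternate removing 1-3 stones; taking the last stone wins). It's an alternating-turn game rather than Pokemon's simultaneous-move setup, but the cache plays exactly the role of the transposition table, ensuring no position is ever evaluated twice.

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # the cache acts as a transposition table
def win_probability(stones: int) -> float:
    """Probability the player to move wins Nim (take 1-3 stones,
    taking the last stone wins), assuming both sides play perfectly."""
    if stones == 0:
        return 0.0  # the previous player took the last stone: we lost
    # Perfect play: pick the move that leaves the opponent worst off.
    return max(1.0 - win_probability(stones - take)
               for take in (1, 2, 3) if take <= stones)

print(win_probability(4), win_probability(5))  # 0.0 1.0
```

Under perfect play, piles that are multiples of 4 are lost for the player to move; every other pile is won. The cache means a pile of a million stones still only costs a million evaluations, not an exponential blowup.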

 

The third block uses the standard Dantzig simplex algorithm for linear programming to convert this expected payoff matrix into a probability vector over the strategies that are not strictly dominated, with the probabilities at which they're optimally played in the microcosm game's Nash equilibrium. There are probably some astute readers wondering why we can't just solve this via far simpler Gaussian elimination: solve the matrix equation Ax = b, for A a trivial manipulation of the payoff matrix and b a constant probability vector expressing that our opponent is indifferent between their expected payoffs in the Nash equilibrium. If you thought that, your intuition is well-honed, good job! The problem is that many of our moves will be strictly dominated by some mixed strategy (i.e., some rows or columns will have every entry lower than some linear combination of other rows or columns), and we need to eliminate these before such a methodology can function. Once the solution is obtained, it's a trivial expected-utility calculation for the gamestate; then we add the gamestate we were initially called with to our transposition table, mapping to this utility value, and return both the Nash equilibrium solution and the utility.

 

Okay now time for the analysis of what this means for updating Reborn's AI for new game content - the reason for this whole thing.

 

For starters, let's note that this algorithm plays PERFECTLY.  It literally never screws up for any circumstance.  If it confuses you why perfect play even exists as a concept, read:

Spoiler

This notion tends to confuse Pokemon players, because they often haven't bridged the inferential gap to thinking about double-blind, simultaneous-selection, asymmetric games in terms of mixed strategies; they think only of pure strategies. If you don't understand this, it basically means "The AI isn't just picking a move - it's figuring out the correct probabilities with which to choose any of its moves." (Many of these probabilities will be 0, because that move is stupid.) Why do it this way? Because we need to handle prediction circumstances. If the AI is playing rock-paper-scissors, it has not succeeded if it decides rock is the best move. The correct response is "play them all with equal odds", and then the computer can use RNG to make its pick - it's literally predicting against you at the best possible guessing rates. Except Pokemon is more complicated than RPS. It's more like playing RPS where you get twice as many points as usual whenever you win with rock. The new correct answer is "pick them all with UNEQUAL odds", so that we can play (i.e. predict) around the newly generated incentives. So that's all this is really doing: making predictions by choosing its probabilities so that no option you pick against it has a better expected outcome than you're entitled to.
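That "RPS where rock scores double" game can actually be solved in a few lines. Here's my own illustrative Python sketch (no real game library involved, and `solve_linear` is a helper I wrote): for a symmetric zero-sum game with no dominated strategies, the equilibrium mix makes the opponent indifferent between all their replies, which is just a small linear system.

```python
def solve_linear(A, b):
    """Tiny Gauss-Jordan elimination with partial pivoting: solves A x = b."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

# RPS where winning with rock scores 2 points instead of 1.
# Our expected payoff playing mix (r, p, s) against each pure reply:
#   vs Rock:     p - 2s        vs Paper:  s - r       vs Scissors: 2r - p
# Indifference: all three equal the game value v, and r + p + s = 1.
# Unknowns ordered (r, p, s, v):
eqs = [[0, 1, -2, -1],   # p - 2s - v = 0
       [-1, 0, 1, -1],   # s - r - v = 0
       [2, -1, 0, -1],   # 2r - p - v = 0
       [1, 1, 1, 0]]     # r + p + s = 1
r, p, s, v = solve_linear(eqs, rhs := [0, 0, 0, 1])
print(r, p, s, v)  # 0.25 0.5 0.25 0.0
```

Note the punchline: when rock pays double, the equilibrium plays rock *less* (25%) and paper *more* (50%), precisely because everyone is tempted by rock and paper punishes it. That's the "pick them all with UNEQUAL odds" idea in action.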

 

So if it plays perfectly, the next thing to notice is that the algorithm never used any mechanics-specific reasoning anywhere.  Literally all that was required to run this algorithm was a working game engine, so that the computer could "visualize" what effects happen when players make their move.  So the conclusion is that in order to play perfectly, we don't have to change a darn thing about the AI from gen to gen!  It only has to know how the mechanics of the gen changed and it can figure out how to play it perfectly all by itself.  Huzzah!  So that means that AI updates for each new episode should take 0 time, right?  WRONG!

 

The problem is that we're all impatient little bitches who don't want to wait for the AI to think. So we need to speed up this thinking process dramatically - using reasonable numbers for the game tree's complexity and the time spent per checked gamestate, this process could feasibly run in acceptable time for (probably?) a 4v4 battle, I think? But each additional Pokemon increases the number of nodes in the tree exponentially (not the fake kind where people colloquially use the word to mean "oh wow, it's bigger", but the real kind - it's legitimately exponential). So handling a 6v6 battle this way is rough. Nobody wants to click their move and then wait 2 minutes while the AI thinks (even though it gets faster with each move it makes, as it has less to think about). Lucky for us, speeding up this expectiminimax-style framework can be done without mid-game evaluation functions (these are what we want to avoid, because they require gen-specific hard-coded calculations [unless we have the computer learn its own mid-game evaluation function via some process like a neural network, but that's a separate tirade])! This is achieved with alpha-beta pruning, a "lossless" pruning algorithm in the sense that it does not reduce our probability of getting the factual perfect-play answer to less than 100%.

 

Unlucky for us though... that isn't good enough :( We're still too slow with only that optimization. Pruning techniques beyond this point are something of an art in the AI community, and you may receive different opinions from experts on how to proceed, but for this problem I personally believe the multi-prob-cut algorithm, invented for use in Logistello, the world's current best Othello AI, is the way to go. This technique essentially shallows up the searching of sections of the game tree which look like they're not going to pan out so great. I know what you're thinking: but isn't that just mid-game evaluation? Obviously we need hard-coded criteria for judging the worth of a gamestate prematurely to do this. No need to panic, though! In Pokemon, enormous swaths of the game tree can be called into question immediately just by using a mid-game evaluation function as simple as "how many Pokemon I've got remaining minus how many they've got remaining". If we shallow up the searching on tree sections where this metric is not brought back into balance (as compared with the best rating we could obtain for the metric at that depth, not 0) within sufficiently many moves, we can cease checking those branches immediately. The English explanation of this heuristic is essentially "we don't let the opponent take kills unless it gives us some pretty measurable and clear benefit quickly afterwards". Not only does this trim the search space enormously, to within easily-manageable levels, it also conveniently didn't require any generation-specific analysis of the worth of our position! So now we've succeeded at producing an algorithm that will stochastically give us something extremely close to perfect play (the multi-prob-cut optimization is lossy, though we gave up very little in this case) and didn't need to understand any hard-coded Pokemon mechanics at all, except that losing Pokemon is generally bad (which is true of all gens). Huzzah!

 

So now we're back where we were before, right? The AI shouldn't need any updates at all for a new episode, ever. Except... still wrong :( This time the problem is a more philosophical conundrum: do we even want our AI to play perfectly? I think it's fair to say that for our "bosses" we do. Gym leaders, Elite 4 members, Solaris, Taka, etc. But isn't this maybe kind of overkill for a wild Weedle? For realistic simulation we probably don't want Weedle emulating that kind of IQ. Though in Weedle's case, maybe we're okay with just basic button-mashing (choose moves randomly)? But what happens in between? Non-boss "normie" trainers should probably have some AI going for them, but it probably shouldn't be this level of sophistication with the perfect-play algorithm. And THERE'S our problem. How are we supposed to play less well than perfect, but better than button-mashing, without some hard-and-fast hand-wavy heuristics? Methods do exist, but they all involve choosing heuristics that just happen to be generation-independent... and it makes for very unsatisfying results. Essentially we'd have to play Pokemon with the "greedy algorithm" of always optimizing for as large a remaining-Pokemon-count discrepancy as possible over the next 10 turns, or something. Meh.

 

TL;DR: Assuming the AI is allowed to cheat and look at your team, it can play perfectly without caring what mechanics it's playing with. So that AI would never need to be updated for new gens! Yay, no wait time! It's only for battles where Ame wants the AI to play fairly well but still sub-optimally that we need a huge heuristic overhaul.

Link to comment
Share on other sites

@Titania, I don't think anybody could play Reborn with that AI, because of the sheer requirements the PC would need to meet to run the game. How many gamestates do you expect to be made/saved during a single battle?

 

I'm going to say that a turn branches with:

(RNG*(MoveOptions+SwitchOptions)*SecondaryMoveEffect+PlayerItemUsage) * (RNG*(MoveOptions+SwitchOptions)*SecondaryMoveEffect+AiItemUsage) = (15*(4+5)*2+30) * (15*(4+5)*2+2) = 81,600 per turn. RNG is the damage modifier, a random number between 0.85 and 1.00. And that's before factoring in abilities, potential speed ties, etc.

 

These branches have to be calculated for their results at the very least, so you sit with 81,600 gamestates to be calculated for the first turn alone.

Link to comment
Share on other sites

You clearly missed the first paragraph. We just cheat and look at the human player's team; that gets rid of virtually all of that. And we wouldn't model the hidden information the same way anyhow: you use Bayesian models for those properties, with priors set from observed metagame statistics, which crushes the possibility space down to the ~4-5 sets that see play in practice. This was the technique used by the coder of Pokemon Showdown's only attempt at a quality bot that I'm aware of, and it was more or less the only successful implementation choice of the project, as the coder sadly didn't understand the importance of mixed strategies and Nash equilibria. He just modeled the entire thing as sequential alternating turns (so the bot believed its opponent would get to see every move it made and react accordingly).

 

Any turn has 81 move pairings tops, often fewer, and almost all of those result in transposition-table hits, either immediately or a single move later, which terminates the searching. Secondary effects generate more states for sure, but such is life; it's a tractable problem. Note that secondary effects don't come into play with terrific frequency, because moves often score kills. A move scoring a kill can in fact eliminate FAR more branches than merely your own secondary-effect chance: it can eliminate all the branches of the tree that don't involve your opponent switching.

 

The one that legitimately slipped my mind is that damage rolls take on potentially as many as 16 distinct quantized values, generating an utterly staggering average branching factor. Clearly that is unacceptable. I would propose modeling all rolls as scoring the median of the 16 values, I think? Treat critical hits as a discrete gamestate still, because they have such a large impact on follow-ups. We can also override the gamestate hashing defaults in a clever way, such that transposition-table entries differing only by minute HP amounts hash to the same value, generating more collisions where states are effectively the same or just produced by an alternate roll.
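That bucketed-hashing trick is simple to sketch. A hypothetical Python illustration (the function and its parameters are my own invention, not anything from Reborn): the transposition key keeps discrete facts exact but quantizes each Pokemon's HP into 5% buckets, so states differing only by a sliver of HP collide on purpose.

```python
def state_key(hp_fractions, extras, bucket=0.05):
    """Transposition key for a gamestate: exact on discrete facts
    ('extras', e.g. statuses, hazards, stat boosts), but HP fractions
    are quantized into 5% buckets so near-identical states collide."""
    hp_buckets = tuple(round(hp / bucket) for hp in hp_fractions)
    return (hp_buckets, extras)

a = state_key((0.51, 1.00), ("burned",))
b = state_key((0.52, 1.00), ("burned",))  # 1% HP apart: same bucket
c = state_key((0.10, 1.00), ("burned",))  # genuinely different state
print(a == b, a == c)  # True False
```

The caveat from the next post applies, of course: if 1 HP ever matters (Sturdy, Belly Drum ranges, etc.), the bucketing has to be disabled or refined for those states, or the cache will merge positions that actually play out differently.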

 

All in all, your estimate of the average branching factor after transposition table hits is off by several orders of magnitude - though I'll spot you 1 since obviously you recollected randomized damage rolls and I did not.  Whoopsie.  You're right there are a few screwy time implications though.  Like how obviously the bot will need to pause a little on turn 1 for thought...  then rarely think again since it memorized how it wants to handle most follow-ups.  Or how you get shafted for running bullet seed on all your Pokemon :P

Link to comment
Share on other sites

Sadly, I can't take the time to try and explain it; I've only looked at bits and pieces of how the AI is constructed, in order to add some AI changes for HC. For the first time ever I'm going to summon Ame and @Marcello, and maybe he'll give a little take on this approach. I'm sure certain fan-game makers might be interested in it, but you never know.

Link to comment
Share on other sites

I just noticed I made a mistake in the formula.

(RNG*MoveOptions*SecondaryMoveEffect+SwitchOptions+PlayerItemUsage) * (RNG*MoveOptions*SecondaryMoveEffect+SwitchOptions+AiItemUsage) = (15*4*2+5+30) * (15*4*2+5+2) = 19,685

 

@Titania I know the AI knows the moves, but there are still more than 81 options both the AI and the player can take: items. The AI in general has fewer than 2 items at its disposal, but a set of Full Heals and Full Restores is certainly an option. The player, however, has a bunch of items available: Potion, Super Potion, Hyper Potion, Ultra Potion, ..., Revive, Max Revive, Antidote, ..., Ether, Elixir, ..., Oran Berry, Pecha Berry, ..., X Speed, X Attack, ..., Poke Ball, Premier Ball, Great Ball, ... Now, not all of those items are applicable all the time, but 30 seems like a good estimate. It sounds like a lot, but once you can use items on Pokemon in the party, suddenly that number begins to look very small. So suddenly your move options become 39*11 = 429 pairs of options.

 

Also, how often do you think a kill occurs? My poor lvl 36 Pheromosa cannot OHKO a lvl 31 Swirlix with Jump Kick or Stomp, which at that point is my best possible move, let alone a Steelix, and that's with 137 base Attack; then you have Sturdy, Focus Sash, screens, ... . So for a lot of turns you will have to calculate potential side effects of moves. Also, 1 HP can matter, not only for clutch survival but for things like Belly Drum, Blaze, ..., so you have to be very careful with what you generalize into one gamestate. You still have to calculate a lot of different possible gamestates.

 

As a final food for thought: how can the AI deal with Metronome?

Link to comment
Share on other sites
