Planet M.U.L.E.

M.U.L.E. Community => Announcements => Topic started by: Peter on October 13, 2010, 11:45

Title: Experimental Ranking System
Post by: Peter on October 13, 2010, 11:45

We have a new experimental ranking system. You can see the rankings for all players in the new system at our ranking page (http://www.planetmule.com/ranks).

Players now have an assigned skill value. This value starts at 100 for new players and can go up and down depending on the outcome of each game session. Winning matches will increase your skill level while losing can decrease it. The system determines how your skill level changes after each match using an adapted TrueSkill (http://research.microsoft.com/en-us/projects/trueskill/) algorithm.

You can read more about this ranking algorithm if you follow the link, but in short it works as follows. Each player has an average skill value and an uncertainty value. The uncertainty tells us how much the system trusts that the average skill is actually correct. Initially the system will guess your skill is 100, but it says that with a big uncertainty. After playing a few matches your uncertainty will shrink as the system can compare you with other players and your mean will move towards your proper skill. If 4 new players compete in a match we get a ranking from 1st to 4th. We can then say with a little more confidence than before that the 1st player is probably above average, 2nd is only slightly above, 3rd may be a little below and 4th is further below average. Their skill levels will be adjusted accordingly and their uncertainty is decreased. The system is still more uncertain about the top and bottom players because they could be miles above or below average; you can't really tell from this match alone.

How much the skill level changes depends on how surprising the outcome is. A player with a low skill level winning over a player with high skill level will get a large skill increase. She is probably better than we previously thought. On the contrary, if the skilled player wins it's not surprising and the skill levels will not change much.

Another example is a low ranked player getting 3rd place in a 4 player game of otherwise skilled players. This can still increase the skill of the 3rd player because she did beat one high skill opponent. In the same game the 1st player will get a normal increase for beating opponents at mostly the same skill.
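To make the "surprise" idea concrete, here is a minimal sketch of the published two-player, no-draw TrueSkill update. It is not the site's adapted algorithm: the function name `update_two_player`, the `beta` value and the uncertainties used in the example are assumptions chosen for illustration, and real games here rank up to four players at once.

```python
import math


def _pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def _cdf(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def update_two_player(winner, loser, beta=15.0):
    """Return updated (mean, uncertainty) pairs for the winner and the loser.

    winner and loser are (mean, uncertainty) tuples. beta is an assumed
    per-game performance spread, not a value taken from the site.
    """
    mu_w, sigma_w = winner
    mu_l, sigma_l = loser
    c = math.sqrt(2.0 * beta ** 2 + sigma_w ** 2 + sigma_l ** 2)
    t = (mu_w - mu_l) / c
    v = _pdf(t) / _cdf(t)   # large when the result was surprising
    w = v * (v + t)         # how much the uncertainty shrinks
    new_winner = (mu_w + sigma_w ** 2 / c * v,
                  sigma_w * math.sqrt(1.0 - sigma_w ** 2 / c ** 2 * w))
    new_loser = (mu_l - sigma_l ** 2 / c * v,
                 sigma_l * math.sqrt(1.0 - sigma_l ** 2 / c ** 2 * w))
    return new_winner, new_loser


# An upset (80 beats 120) compared with the expected result (120 beats 80).
upset_winner, upset_loser = update_two_player((80.0, 10.0), (120.0, 10.0))
fav_winner, fav_loser = update_two_player((120.0, 10.0), (80.0, 10.0))
```

In this sketch the underdog's mean rises much more after the upset than the favourite's mean rises after the expected win, and both uncertainties shrink in either case.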

Compared to TrueSkill we use a lower uncertainty to begin with. This means that the system is quite confident that new players are average and they will need to prove in several games that they get 1st or 2nd position. In practice this means that the rankings change at a slow pace. However, a new player can get to the top 20 in only a few games if she's playing against skilled opponents. It's also possible to reach the top by playing and winning several games against average and above average players.

We have chosen to weight games by the number of human players. A 4 player game will give 100% of the skill points, a 3 player game 60% and a 2 player game only 10%. AI opponents do not count for ranking, so losing or winning against them does not affect your skill.
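The weighting can be pictured as a simple lookup. This is only a sketch: the names `HUMAN_WEIGHT` and `weighted_change` are made up for illustration, and treating a game with fewer than two humans as worth nothing is an assumption based on AI opponents not counting.

```python
# Share of the skill change that is applied, keyed by number of human players.
HUMAN_WEIGHT = {4: 1.0, 3: 0.6, 2: 0.1}


def weighted_change(raw_change, human_players):
    """Scale a raw skill change by the number of humans in the game.

    Games with fewer than two humans are assumed to contribute nothing.
    """
    return raw_change * HUMAN_WEIGHT.get(human_players, 0.0)


for humans in (4, 3, 2, 1):
    print(humans, weighted_change(10.0, humans))
```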

If a player abandons a game and the game is still finished legally it will affect the abandoning player's ranking.

All decimal places count in the skill level even if only the integral part is visible. When players get to the higher skill levels the increases and decreases will decline and the skill can never go above 1000.

We're working to show how each game affected your skill level. It will be linked from the player names in the ranking page and display plus or minus (+/-) the number of skill points per game.

The outcomes of new games are not added to the rankings yet.

Note that this ranking system is in its experimental stage and will likely be modified. We'd like your feedback so please discuss it in this forum thread:

Do you think the new ranking is fair or not?
Do you have any questions?
Would you like to see another way of ranking?

Regards,
The Dev Team

Title: Re: Experimental Ranking System
Post by: piete on October 13, 2010, 12:33

Interesting update. (Congrats Kipley, you finally made it! ;)) Not being able to study the system more deeply at the moment, my personal ranking seems a bit high considering that I have played very few games lately.

Is there any weighting favouring more recent games? If not, I think even good players need to constantly show their skills, so that historical success is not proof of current skills.

A very welcome addition is that not only victories count, which really gives a chance to players that can't invest as much of their time as others (when I retire and my kids are grown up, I'll have a chance to play more...)

Title: Re: Experimental Ranking System
Post by: MuleyMan on October 13, 2010, 17:00

I have not studied ranking algorithms and don't intend to.
So my ideas may clearly not work, but here they are.

There are players who are highly ranked and have not played in 6 months (ex. BaronHelix). Use the last 90 days for ranking, or give games over 90 days old less weight in the ranking. I started with Muleyman, got typical abandons and losses while learning the game, then started a nick (Wheetfarmer) that is higher ranked than MM even tho I have only played a few games recently with Wheet. I went back to Muleyman since total games played mattered and I wanted to concentrate stats mainly on one character.
Also, Abandons don't seem to matter much in rank (ex. BaronHelix)
I think tho that the main problem is using all stats and not limiting to a 2-3 month timeframe.

Title: Re: Experimental Ranking System
Post by: data2008 on October 13, 2010, 17:13

Thanks for the comments!

We plan to do the "diminish rating time-factor" next.
That means the Ranking displays the ranking among the currently active and consistently playing mule folks.
Like in tennis, if you do not show up in the future, your ranking will slowly drop over time.

So the new ranking is an actual measurement between competing players,
while we still keep the alltime-hiscore list which shows overall achievement regardless of when it happened
(more like the alltime olympic medal statistic).

Since we store all information of each game, we can even reconstruct who was in the top1000 at what time, so you see players from the past having much higher positions at one point in time, while they currently do not play anymore and therefore have lost their topspot ranking position. We could therefore display for each player a detailed page showing how his ranking and skill evolved over time and which players he played against, together with the ranking outcome. A very basic page exists if you click on a player's name in the ranking list, but it will be much improved.

Title: Re: Experimental Ranking System
Post by: kipley on October 13, 2010, 22:39

Quote from: piete on October 13, 2010, 12:33

(Congrats Kipley, you finally made it! ;))

Thanks, piete! Though my being at the top of the new rankings may indicate a problem with the system. I know for a fact there's a handful of players out there that can consistently outplay me. This is not false modesty on my part, it's just the way it is.

And yet under this new system I outrank these players. I'm not certain why. Perhaps it's because these same players also tend to be very particular about whom they play against... as they are also players that go out of their way to only play in games with experienced opponents (and often highly skilled opponents, as well). I'm not nearly as picky... when I want to play Mule, I tend to want to play now, and so I play against whatever first three players show up, whether they be experienced or complete newbies. The only folks I refuse to play against are those that have an excessive amount of abandons (or certain "infamous" players mentioned ad nauseam in these forums).

So perhaps the new system overly rewards playing against weaker opponents? Because that's the only reason I can think of why I should outrank some of the worthier competitors out there.

Title: Re: Experimental Ranking System
Post by: Mt-Wampus on October 13, 2010, 23:02

Who cares about rankings! just play for the fun of the game.

Title: Re: Experimental Ranking System
Post by: Death_Mule17 on October 13, 2010, 23:30

Quote

So perhaps the new system overly rewards playing against weaker opponents?

I think you're correct kip, as i noticed my rank shot past 120/ from current #20...I usually play as the lowest ranked in my nightly games with rodz and friends.

I guess im not as good at mule as i thought, oh well.

ps. @mt.wampus , I CARE ABOUT THE RANKINGS!!, I played for the fun of it back in the 80s... and never dreamed of being ranked on a global scale.
you know , people can still have fun while climbing the ladder against the best of the best and i think its a great addition to mule!

Title: Re: Experimental Ranking System
Post by: Mt-Wampus on October 14, 2010, 04:02

Rankings are based on number of wins. Guys who have jobs and lives have no shot at a high ranking while the guy who spends his entire life on Planetmule is "Worldclass". Just don't agree with that. There is no realistic way to rank people that is fair to all. Hence the fact that i could care less about rankings.

Title: Re: Experimental Ranking System
Post by: Chuckie Chuck on October 14, 2010, 04:14

I saw my ranking drop slightly, but I also feel it is more accurately reflective of my skill vs others. There might be a feature to add. Be able to view both "Overall average ranking for duration of membership", and a second sort that ranks based on a 3 month history. Perhaps include two ranking columns in the game engine that will reflect both sorts, so people can know both the player's long term success and current average.

Title: Re: Experimental Ranking System
Post by: rodz on October 14, 2010, 05:41

i have found my ranking drop from 1 to 5 with new system.
fair enough i suppose if this is the new system.
i will have trouble improving my ranking due to the time differences between me (nz) and other countries.
to play i need to play when i can and this tends to be out of sync with other top ranked players.
i am not too concerned because i have achieved what i set out to do initially which was to #1 enjoy playing mule, #2 play against real people, #3 play to the best of my ability

all i can say is thanks for making mule a game we can all enjoy and play against people from all over the world

Title: Re: Experimental Ranking System
Post by: piete on October 14, 2010, 10:57

Quote from: rodz on October 14, 2010, 05:41

to play i need to play when i can and this tends to be out of sync with other top ranked players.

You're out of sync anyway because of the lag you cause! ;)

Quote from: rodz on October 14, 2010, 05:41

i will have trouble improving my ranking due to the time differences between me (nz) and other countries.

Not exactly true, every victory still improves your skill score, although maybe in smaller steps. Keeping up your pace of games and winning the majority of them will still get you to 1st place eventually.

Title: Re: Experimental Ranking System
Post by: dynadan on October 14, 2010, 13:04

The new system seems like a good idea. I can't really tell if the system is fairly implemented or not yet. Need to see some more cause and effect.

I do agree with Piete and Muley Man about weighting the more recent games heavier. And/Or make the rating go down 1 point every day of no playing. Whatever the system, ideally you don't want people to get to the top and then be able to camp there without playing.

Anyway great addition yet again. Thanks for all the work.
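The "1 point every day of no playing" suggestion above could look roughly like the sketch below. Everything here is hypothetical: the function name `decayed_skill`, the `points_per_day` default and the floor of 0 are assumptions for illustration, not how the site works or plans to work.

```python
from datetime import date


def decayed_skill(skill, last_played, today, points_per_day=1.0, floor=0.0):
    """Lower a skill value by a fixed amount for each idle day.

    The floor stops long-absent players from dropping below a minimum;
    both the rate and the floor are assumed values.
    """
    idle_days = max(0, (today - last_played).days)
    return max(floor, skill - points_per_day * idle_days)


# A player at 150 who last played 13 days ago.
print(decayed_skill(150.0, date(2010, 10, 1), date(2010, 10, 14)))
```

A multiplicative decay or a weighting of recent games would serve the same goal of stopping players from camping at the top without playing.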

Title: Re: Experimental Ranking System
Post by: Bumbes on October 14, 2010, 13:23

Well, I like the new ranking system, especially with the coming "diminish rating time-factor".
Pre version 1.3.3 host advantages will disappear in the stats slowly, at least for that ranking.

The opponents statistic list is very interesting to watch also, keep up the good work.

Title: Re: Experimental Ranking System
Post by: Rhodan on October 14, 2010, 17:19

I propose when the new ranking system goes live to reset everyone to zero.

Title: Re: Experimental Ranking System
Post by: C64 nostalgia on October 14, 2010, 18:42

Quote from: Rhodan on October 14, 2010, 17:19

I propose when the new ranking system goes live to reset everyone to zero.

Title: Re: Experimental Ranking System
Post by: C64 nostalgia on October 14, 2010, 22:23

I think the current ranking system is meaningless. So, if Planet MULE changes to TrueSkill, all the better. However...

I started reading the background articles. Very quickly, I stumbled on to this, "More recently the XBox system has stated that it's explicitly for matchmaking, with the goal being to always try and match up players at nearly the same skill level. It's also used for hierarchy (or "leaderboards" as it's described in the TrueSkill docs), but that's clearly a subsidiary purpose."[1]

Planet MULE makes no effort to match players, nor does Planet MULE have enough players to warrant trying to do so. If Planet MULE isn't matching, then very skilled players can and will play unskilled players. When this happens large changes in assigned skill values can occur. One of the results of this phenomenon will be highly ranked players will be even less likely to play new players--they now risk their ranks. Planet MULE is looking for a ranking system. The only thing I see happening by using a matchmaking system is a greater divide between regulars and new players. If anything Planet MULE needs a handicapping system with a ranking component.

Related to the "diminish rating time-factor": I think it's extremely important to keep time since played and number of games played separate. Just because someone plays a lot has no reference to their skill.

1. http://www.lifewithalacrity.com/2006/01/ranking_systems.html

Title: Re: Experimental Ranking System
Post by: leahcim99 on October 14, 2010, 23:02

+2 - Reset all to 0 if we go with new system.

With the recent freeze up issue, some of us have 20+ abandons that we should not have.

Title: Re: Experimental Ranking System
Post by: Death_Mule17 on October 14, 2010, 23:51

Anyone know WHEN the ranking system will be reset, as im about to start a new account due to all my abandoneds. Im finding it harder to get people to join my games....i have about 63 abandoneds out of 379 games, and i still have never quit or closed a game EVER. But new players to pm probably think im like akire1 or somebody. It would be great to add a DC column so players know the diff between a quitter and a guy who has connection issues from time to time....

ps. I have had no connection issues in 2 weeks, ive learned how to adjust router settings with the help of players of pm (thx guys)...so please join my games, they're 100% now (lurkers welcome)

Your host with the most..
DM

Title: Re: Experimental Ranking System
Post by: Homie The Clown on October 15, 2010, 09:30

May want to check your new ranks system.....

Found something very very funny. The known cheaters (Akire1,Uschi,Simpla the Best) somehow get two games for the price of one.

Look up ranks/ akire1 and you will find that the same date/game/time/sector map are listed twice.....?

maybe for other players too, but i find it funny that the known cheaters would have something like this working for them.

just thought i would let you know.

Homie Don't Play that.......

Title: Re: Experimental Ranking System
Post by: Peter on October 15, 2010, 11:50

Quote from: Homie The Clown on October 15, 2010, 09:30

Look up ranks/ akire1 and you will find that the same date/game/time/sector map are listed twice.....?

It's fixed now. The page previously showed duplicate rows for games with two bots, but it did not affect the skill points.

Title: Re: Experimental Ranking System
Post by: Mt-Wampus on October 15, 2010, 14:19

At least the new rankings system did get the #1 guy correct ! Kipley should be ranked #1. His winning % and the quality of players he takes on make him deserving plus he plays straight up! No AI or picking on new guys. No disrespect to Rodz, Rhodan, Brad or the other top guys but they are leading the pack based on the unreal amount of games played and wins as a result. I do rate guys like Rhodan, Rodz and Brad as top players though. Have played them all and they are great players. Just more impressed by a few guys that have played half as many games is all. I do think the new system has dropped the ball on guys like Simpla and Uschi though! No way those guys should even crack the top 500!

Title: Re: Experimental Ranking System
Post by: C64 nostalgia on October 15, 2010, 19:45

Quote from: C64 nostalgia on October 14, 2010, 22:23

I think the current ranking system is meaningless. So, if Planet MULE changes to TrueSkill, all the better. However...

Planet MULE makes no effort to match players, nor does Planet MULE have enough players to warrant trying to do so. If Planet MULE isn't matching, then very skilled players can and will play unskilled players. When this happens large changes in assigned skill values can occur. One of the results of this phenomenon will be highly ranked players will be even less likely to play new players--they now risk their ranks. Planet MULE is looking for a ranking system. The only thing I see happening by using a matchmaking system is a greater divide between regulars and new players. If anything Planet MULE needs a handicapping system with a ranking component.

My above paragraph is muddled. I want to try to be more clear and elaborate.

So far, most commenters are looking at the static qualities of the proposed ranking system. No one has mentioned the dynamic aspects of day-to-day use. The real world movement and change of rankings strongly warrants further study.

Highly ranked players will have a lot to lose when they play much lower ranked players. If they lose in games like these, their rank will be hammered very quickly. MULE is a game with a large luck component, so the possibility of losing isn't small. Losing streaks aren't rare. Good players will be even less likely to want to play bad (low ranked) players. The pool of players will become more segregated and thus effectively smaller. This is very bad for an already segregated and small pool of players.

Additionally, new players who don't play like experienced players will be reviled and avoided even more. The games with unexpected behavior (read: the stuff new players do) are the ones experienced players are more likely to lose.

TrueSkill is about matchmaking. Planet MULE does not have enough players to use TrueSkill for its best and primary purpose.

Title: Re: Experimental Ranking System
Post by: rodz on October 15, 2010, 20:02

after some thought on the matter (yes it hurt) i feel some changes need to be made if we change to this new ranking system, they are

1. all stats reset to 0 (except hosting reliability %)
2. no land auctions in round 1 ( more luck than skill to get one)
3. no land win/loss before round 4 (again luck not skill and can dramatically affect game outcome)

other than that i feel the game is as perfect as you can make it and thank all those involved for the great work you are doing.

ps. don't worry piete if i ever have the pleasure of playing you again i will get my woolly friends to double the lag you get lol

Title: Re: Experimental Ranking System
Post by: C64 nostalgia on October 15, 2010, 22:27

Quote from: rodz on October 15, 2010, 20:02

Replied to in Changing the game to accommodate a ranking system dishonors M.U.L.E. (I really do need to tone down my sensationalist thread subjects. :) Please forgive me.)
http://www.planetmule.com/forum?topic=1128.0

Title: Re: Experimental Ranking System
Post by: doktorbuzzo on October 16, 2010, 04:33

As long as we're making suggestions about the new ranking system, how about incorporating it into the initial turn-order determination in games? The order can still be randomly determined, but don't use a uniform probability distribution to draw the starting order. For example, use player ranks to make individual draws on differently parameterized probability distributions thusly:

1. Let each player's unit-variance normal distribution be centered about a mean value that increases with improved player rank (so a player ranked 3rd will have a distribution centered about a higher value than a player ranked 33rd).

2. Draw a small "fudge factor" value from a fixed (identical for all players and for all games) uniform distribution with a range of, say 0 to 0.1 (this range should be fairly small) for each player. Call this value U(player).

3. Draw one random value from each player's normal distribution (1 above, normal). Call this value N(player).

4. Add the two random values to obtain the rank ordering for all players

R(player) = U(player) + N(player)

5. The game's starting order is then determined according to the descending order of R(player) values. So the player with the highest R(player) value starts in first, the player with the next highest R(player) value starts in second, etc., like so:

1. Player1, R(Player1) = a
2. Player2, R(Player2) = b < a
3. Player3, R(Player3) = c < b
4. Player4, R(Player4) = d < c

Ideally this system will serve as a meaningful but not entirely predictable (or pre-ordained) handicapping system to "seed" players into a starting order before a game. Players of similar rankings will see the greatest amount of variation in their starting positions, while players of widely varied rankings will mostly find themselves ordered with the lowest ranked player starting fourth and the highest ranked player starting first. The addition of the U(player) value should prevent the system from becoming too rigidly deterministic and avoid unduly penalizing highly ranked players.
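The five steps above can be sketched as follows. The post does not say how a rank maps to a mean, so `mean_for_rank` (minus the natural log of the rank) is purely an assumed placeholder, and the function names are made up for illustration.

```python
import math
import random


def mean_for_rank(rank):
    """Assumed mapping: a better (numerically lower) rank gives a higher mean."""
    return -math.log(rank)


def draw_start_order(ranks, rng=random):
    """Return player names in starting order, highest R(player) first.

    ranks maps player name -> leaderboard rank (1 is best).
    """
    scores = {}
    for player, rank in ranks.items():
        u = rng.uniform(0.0, 0.1)                   # step 2: small fudge factor U
        n = rng.gauss(mean_for_rank(rank), 1.0)     # step 3: unit-variance normal N
        scores[player] = u + n                      # step 4: R = U + N
    # Step 5: descending order of R decides who starts first.
    return sorted(scores, key=scores.get, reverse=True)


rng = random.Random(2010)
print(draw_start_order({"P1": 3, "P2": 33, "P3": 120, "P4": 400}, rng))
```

With widely separated ranks the best-ranked player starts first in most draws, while players of similar rank trade places often. How strongly the order follows the rankings depends entirely on the assumed mean mapping.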

Title: Re: Experimental Ranking System
Post by: GambitTime on October 16, 2010, 09:38

I don't mind a new ranking system per se, however awarding people points for 2nd and 3rd place has a MAJOR flaw:

I would make certain plays during games to win or bust. Sell a plot of land late, buy stite @108, etc. Sometimes I take major gambles to WIN, not to play it safe and hope to come in second. I have finished last many times when I could have just played it out and finished 2nd.

I won a game last week making almost no ore or stite, but instead took all 4 river plots and won with food and energy; against very good players ranked in the top 100. If a strategy like that bombs on me and I finish 4th and lose ranking, where is my motivation to try to experiment with something new?

The basic premise is wrong, the 4th place player is NOT necessarily worse than the 2nd place guy and maybe even played better.

If there is a new rating system, still award points for winning and nothing else. Don't reward a player for just being the first loser.

One way to do it is by rewarding the winning player extra points for how much he wins by.
Winning by 5000 is a lot more impressive than winning by 500. Although frankly I am happy with the ranking system the way it is.

Title: Re: Experimental Ranking System
Post by: GambitTime on October 16, 2010, 10:04

My post above is my intellectual argument. Here is my emotional one:

PLEASE, PLEASE, PLEASE, do not go to a TrueSkill type of ranking system. I have played other games that use it and I can't stand it.

It especially will not work in MULE because the sample of games isn't nearly large enough.

MULE is about winning, people!

WINNING! I don't want to feel good about coming in 2nd.

Title: Re: Experimental Ranking System
Post by: Chuckie Chuck on October 16, 2010, 15:51

Quote

Replied to in Changing the game to accommodate a ranking system dishonors M.U.L.E. (I really do need to tone down my sensationalist thread subjects. Please forgive me.)

I am in agreement with C64nostalgia. I like the idea of the new ranking system, but I don't believe the game should be changed to accommodate it to that extent. The idea here was to make the game as close to the original version as possible, and I'd like to see the ranking system reflect that design, not an intentionally predictable strategy due to strict game parameters that make random events less random.

I say, thumbs up to the new ranking system, thumbs down to changing the game.

I disagree with Gambit about giving 2nd and 3rd place points though. Yeah, gambling is part of the game, but here is a thought, this ranking system should also show to some extent how successful a gambler you are. Sometimes, I make a mistake early on that I can't gamble my way out of, but I do my best to try, and if I get 2nd at the end, I'd like to get credit. If I gamble and lose, well, it was a bad gamble! So I didn't get any new points, wouldn't be the worst thing that could happen.

Reset stats to zero when the new ranking system is finalized, I think is just common sense. Let's work out the ranking system first.

Title: Re: Experimental Ranking System
Post by: Chuckie Chuck on October 16, 2010, 15:55

Gambling skill

In a casino environment, in Vegas, when you lose a hand, do they say, you get to keep your bet?

Title: Re: Experimental Ranking System
Post by: Intergalactic Mole on October 16, 2010, 23:50

IMO there should be no "ranking system" at all for MULE, only a high score list.. just like most other games from the 80s. I think it creates too much animosity.

Title: Re: Experimental Ranking System
Post by: Mt-Wampus on October 17, 2010, 00:15

Quote from: Intergalactic Mole on October 16, 2010, 23:50

IMO there should be no "ranking system" at all for MULE, only a high score list.. just like most other games from the 80s. I think it creates too much animosity.

Couldn't agree more!

Title: Re: Experimental Ranking System
Post by: rodz on October 17, 2010, 07:39

the reason for my comments on changes if new system is adopted are

if we are to make the ranking system skill related then skill not luck has to be foremost and this is not always the case with plots sales on 1st round and losing plots early.
i really can't see the new system making a lot of difference but will cede to the majority

maybe the easy way is to divide a player's total score by the number of games played and this will give an average score and rank on this. every player will continue to try even when in last to keep their average up.( just another thought) and it did hurt as much as last time.

in the meantime i will just enjoy playing mule
thanks again
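The averaging idea above (total score divided by games played, then rank on the result) is easy to sketch. The function name `rank_by_average` and the sample numbers are made up for illustration; leaving out players with no games is an assumption.

```python
def rank_by_average(totals):
    """Rank players by average score per game, best first.

    totals maps player name -> (total_score, games_played).
    Players with no games are left out (an assumption of this sketch).
    """
    averages = {
        player: score / games
        for player, (score, games) in totals.items()
        if games > 0
    }
    return sorted(averages, key=averages.get, reverse=True)


# Made-up numbers: "b" has fewer games but a higher average than "a".
print(rank_by_average({"a": (30000, 3), "b": (22000, 2), "c": (0, 0)}))
```

Note that a plain average ignores who the opponents were, which is the main thing the TrueSkill-style system tries to account for.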

Title: Re: Experimental Ranking System
Post by: mikman on October 18, 2010, 02:50

I definitely like the idea of a high score list. Keeping it simple is always the best way in my books :-) That being said, I also really like all the stats that the current and new ranking system have. I like to look at the numbers and compare different stuff and come to my own conclusions about who is really ranked the highest ;-)

Title: Re: Experimental Ranking System
Post by: dynadan on October 25, 2010, 22:32

While I respect where the "High Score List" requests are coming from....It makes no damn sense. High scores are already listed....and of the scores listed almost all the high scores are collusion games in 1 form or another.

Stats are fun. The more stats and ability to study them (graphs, charts, etc.) the more fun people have with them. Change is also fun. That is the reason the old system is so unsatisfying to people. The people at the top can not be challenged for months or possibly years at this point. I think a system where positions are changing all the time is the most exciting for people to keep track of and to compete in.

I think the true skill system works pretty good, although i do understand people who say only 1st place should be awarded. The problem i see with that is it is too easy for people to "screw around" or purposely mess up the game if they think they are not going to win. A full position based ranking system will help people stay motivated and competing even though 1st place may be out of reach.

I don't think Rodz's suggestions were to change the game to help the ranking system, but merely to improve the game. The plot take away/give event would make the game much better if it was addressed. Although, we are all bound by the same rules of chance and in the long run skill will be the predominant factor. I actually think the 1st round auctions make the game more interesting and closer to the original and while i was against it at first, I have since changed my mind.

I don't think the stats should be reset to zero. First, I think it's neat that every game ever played can be looked up and studied. We have all that information still, why not use it? Second, a TrueSkill ranking system takes a lot of games to start giving accurate numbers; we already have a pretty good base, why not use it? As part of the stats not getting reset it is imperative that we alter the system to provide a more dynamic system with recent games mattering more than older games and/or letting the ranking diminish with time....this serves two purposes.... First, it doesn't punish people for the learning curve that everyone must go through when they start playing but that no longer affects their current quality as a player. And secondly it keeps people from "camping" at the top of the leaderboard resting on their previous accomplishments (whether gained fairly or not). The more movement a leaderboard has the more people will be enjoying it.

Doktor Buzzo also had an interesting idea on everybody's ranking for the 1st turn, but i think that is a very different issue and should be continued in another thread.

Title: Re: Experimental Ranking System
Post by: C64 nostalgia on October 26, 2010, 20:41

My ideal ranking system...

A player's ranking can go up or down based on that player's own performance

Number of games played has no direct relation to rank

First place wins are much more important than other places

Colony scores have an influence on ranking but only weakly (applies to all players in a game)

The skill of your opponents affects the importance of your win or loss

Make games that start with AI's worthless

The previous record of play is not congruent. The wins and losses are a result of many different versions of Planet MULE. Some versions of Planet MULE radically changed play from a previous version. A classic example is the versions where smithore and crystite never spoiled. These never-spoiling ore games are what account for a majority of "high scores." A new ranking system needs to start fresh (and potentially "reset" every major gameplay change). As far as older games not being counted as much as newer games: if a solid ranking system is picked and "reset" periodically, this aspect should not matter much to current rankings. The old games would really only matter to statistics such as win/loss and related... the yummy stats. (Notice the quotes around reset. Reset is a flexible term.)

I still maintain Planet MULE should become more like the original M.U.L.E. -- as much so, as possible. So to change Planet MULE purely to accommodate a ranking system is an anathema. (Luck is random boys. By definition, it affects everyone equally over time. Random is good because it creates variation, especially in those ~~snobby~~ high-skill games. M.U.L.E. has never tried to be consistent outside of the boundaries created by its rules and the "luck" inherent to its design.)

HOWEVER, in the spirit of the beginner's species. A handicapping system would be an awesome (rocket-ship-awesome) addition. This system ought to be more important than a ranking system. Our player pool is small. Highly skilled players regularly play low or unskilled players. You either wait (sometimes for hours), or you play as soon as you have 4 players who at least know the basics (again, sometimes for hours)... Handicapping could help equalize players -- making play more enjoyable for everyone. Equalizing players would have a cool side effect, as well. Other plays could open up beside the standard ore then crystite flow. A compensated lower skilled player could try novel strategies and in doing so affect the dynamic of the game. And, this is good because the standard ore then crystite flow is becoming boring. This method is about as refined as it can be... reliable, straightforward, and somewhat tiresome wins.

The ranking system should follow a handicapping system. If we have a good handicapping system, something like a modified TrueSkill could actually work to create a leaderboard here.

"We term a match “uninteresting” if the chances of winning for the participating players are very unbalanced – very few people enjoy playing a match they cannot win or cannot lose. Conversely, matches which have a relatively even chance of any participant winning are deemed “interesting” matches." [1]

This is the paradigm of TrueSkill. But, the way Planet MULE is now, where players regularly play opponents way above or below their skill level, will not fit into a system designed to track players playing in games with evenly matched opponents. Planet MULE using TrueSkill will lead to wild (and infuriating) changes in rank. It will also scare highly ranked players away from playing much lower ranked players. As I have said before, we do not have enough players to use this idea. Handicapping is the only way to make "interesting" matches within Planet MULE's players.

I actually really wanted to support TrueSkill (or anything really) because it would be relatively simple to push it through to replace a very flawed ranking system. Unfortunately, TrueSkill and Planet MULE as it is now would be a bad match.

[1] http://research.microsoft.com/en-us/projects/trueskill/details.aspx

[edit: Added "Make games that start with AI's worthless" Big mistake forgetting this one. I can't believe I left this one out.

As no replies have been made, I ended up rewriting parts of this post.]

Title: Re: Experimental Ranking System
Post by: dynadan on October 27, 2010, 22:04

@c64 nostalgia

Your ideal ranking system seems to be almost exactly what the TrueSkill system does, except for taking into account the colony score (which I think is a good idea; I just don't want to be the one to figure out the exact formula). AI games aren't completely worthless, but they certainly are valued much less than 4-human games.

I still don't understand why you think the TrueSkill system discourages better players from playing worse players. A top player doesn't lose to a bottom player very often, and when it happens I don't see the problem with losing more rating points.

One last thing, the handicapping idea is an interesting thought, but it really seems like it should have been suggested by someone who hasn't been bashing others for trying to change the game to accommodate a ranking system. I can't imagine a bigger change than a handicapping system to the original Mule.

I realize that TrueSkill was designed to help with match-making, but that does not mean it doesn't work equally well as a ranking system... again I think the only change it needs is for a player's rating to gradually drop if they do not play. (This hurts me as much as anyone, since due to the arrival of my baby I no longer have the time to play like I used to.)

Title: Re: Experimental Ranking System
Post by: C64 nostalgia on October 28, 2010, 09:56

Has anyone actually read Microsoft's TrueSkill (http://research.microsoft.com/en-us/projects/trueskill/) webpages and any of their links for more information? Collective Choice: Competitive Ranking Systems (http://www.lifewithalacrity.com/2006/01/ranking_systems.html) is a good overview that's pointed out.

Quote from: dynadan on October 27, 2010, 22:04

My ideal ranking system (with an *addition), and whether TrueSkill meets each point:

A player's ranking can go up or down based on that player's own performance: Yes

Number of games played has no direct relation to rank: Yes

First place wins are much more important than other places: NO

*Margin of a win affects importance of win: NO (Technically it could be yes depending on implementation, but for what I want NO.)

Colony scores have an influence on ranking but only weakly (applies to all players in a game): NO

The skill of your opponents affects the importance of your win or loss: Yes

Make games that start with AI's worthless: Not applicable

TrueSkill isn't even close to my ideal ranking system.

Quote from: dynadan on October 27, 2010, 22:04

I still don't understand why you think the TrueSkill system discourages better players from playing worse players. A top player doesn't lose to a bottom player very often...

Why do you think players are complaining about luck-based elements of M.U.L.E.? Luck makes any player a wildcard.

Quote from: dynadan on October 27, 2010, 22:04

One last thing, the handicapping idea is an interesting thought, but it really seems like it should have been suggested by someone who hasn't been bashing others for trying to change the game to accommodate a ranking system. I can't imagine a bigger change than a handicapping system to the original Mule.

Cool, someone called me on this. I wanted to explain this. Again, I still deeply believe Planet MULE should be as similar as possible to the original M.U.L.E.'s. But there are certain things that had to change... such as lag related issues and the creation of internet play. (A quick aside: if Planet MULE makes changes from the original, the change should be systematic and not isolated. So if you raise one price, all prices should rise to preserve the original balance in gameplay.) Original M.U.L.E. had a beginner species designed specifically to handicap players. Planet MULE's omission of any equivalent goes against the original design. The irony is Planet MULE has a vast record of player information. The C-64 version played in less than 64 kilobytes and knew nothing of a particular player at the eve of a game. Handicapping is specifically an area where the amazing improvement in computers could be a highly beneficial and appropriate change.

Quote from: dynadan on October 27, 2010, 22:04

I realize that TrueSkill was designed to help with match-making, but that does not mean it doesn't work equally well as a ranking system...

If this is true, please tell me why? This request goes to anyone, not just dynadan. Feel free to explain it slowly and carefully. Maybe, I misunderstood something or completely overlooked an important bit... I am sincerely interested in learning about ranking systems in general.

And finally to leave you with something I found funny. A question and answer from the TrueSkill FAQ (http://research.microsoft.com/en-us/projects/trueskill/faq.aspx):

"Q: I am among the top 100 players in the world in my game mode. Why do I usually wait longer in the matchmaking lobby than my friend JoeDoe who is an average skill player?

A: This has an easy explanation: There are simply not enough players of your calibre available at any time! Remember that Xbox Live is a worldwide service, so there are perhaps only 1000 players that would be a perfect match for you. Living in 24 different time zones. The only alternative is to match you with players who are much less skilled and sacrifice match quality for waiting time. And this would ruin both their and your experience on Xbox Live. You see: being a top player has its price!"

I wonder what the questioner would think of wait times on Planet MULE.

[edit: expanded "If this is true, please tell me why? Maybe, I don't understand or missed something..."]

Title: Re: Experimental Ranking System
Post by: dynadan on October 29, 2010, 23:43

Ok C64 Nostalgia, I think we are actually very close on our opinions here, so hopefully we can get on the same page. Please excuse the lack of quotes in my post.

Yes, I have read the majority of the info on the TrueSkill system (although I must admit a lot of the math simply makes my eyes unfocus). I have been involved in several other projects similar to Planet MULE as far as trying to come up with fair leaderboard systems, and everyone (including me) always assumes that implementing a fair system is easier than it actually is. The TrueSkill system may not be the absolute best system possible, but as far as I know it is the best system that actually exists in the real world. Please feel free to check out the other systems listed on the Xbox website; none of them suits Planet MULE better. Systems like ELO are not really appropriate because they deal with games that are 100% skill, like chess.

I am fairly sure that the TrueSkill system does award 1st place finishes much more than other positions. Although it is possible to still gain rating points for finishing 2nd depending on the skill of your opponents....it is also possible to lose rating points for 2nd, again depending on the skill of your opponents.

I agree that in a perfect system margin of win should also have a small effect on your rating. I also agree that giving total colony score a weak effect on rating would be nice as well. However, I haven't the slightest clue (I doubt you do either) how the developers would add this into the current TrueSkill formula.

Still seems to me that the TrueSkill system is very close to what you actually want. If you can figure out how to implement those 2 great suggestions into the formula please post the math and maybe the developers can try it out.

I disagree with your assertion that people complain about the "luck" elements of Mule. The luck elements are what make Mule such a great game. Without the luck elements there would be very little replayability to the game, e.g. are the pirates going to come this turn, where is the crystite, what price will crystite be the last turn, is a mountain going to move in a quake. Mule behaves like many combo luck/skill games (Poker, Risk, etc.). On any given game there is a chance that anyone can win, but in the long run the more skillful player will always end up winning more. The plot take away/give event to me is a very different issue, and we should debate that in another thread.

The TrueSkill rating system was designed to solve the problem of fairly matching up players, but in doing so it also created a fair way to rank players on a leaderboard. Basically the idea was to create a rating system so that the 1000's of players playing games on the Xbox or PlayStation wouldn't just join a random sampling of players, but instead would be able to join people close to their skill level, in effect creating an environment where the new players weren't destroyed by the veterans until they quit, and keeping things interesting for the veterans by testing their skills against other veterans. Most of the games it was made for had a very high skill-to-luck ratio. Halo or Call of Duty are much more skill-based games than games like Mule, Civilization Revolution, or Poker. But even with high luck factors, given enough games the better players will be listed at the top of the TS rating system. I don't really know what else I can add... Think of it like a chess rating system: the highest rated player is considered the best player.

As far as the question and answer from the TS FAQ: This IS a problem with the TS system in regards to matchmaking. I have been near the top of several xbox leaderboards that use the TS system and as many of you know I also tend to play at a lower volume time of day, so I have experienced this issue and can tell you it is really flawed. However I don't know how this creates any sort of issue for us since we are only interested in using TrueSkill for leaderboard rankings.

In regards to your handicapping idea: I am neither for nor against the idea in principle. I think this is definitely an idea that could at least be debated for Mule2. But I think trying to do this to the original version would be opening up a new can of worms. I would enjoy seeing some specifics on what you had in mind.

Ok now here goes what I would specifically do to improve the TrueSkill system as a ranking system for Planet Mule:
I really liked C64Nostalgia's 2 suggestions to add very small weight to margin of victory, and the colony score. I don't know how you would fit this into the rating formula however. Maybe some of you math geniuses can figure it out?
My number one request (and one that I think can be easily put into the formula) would be: for every week of inactivity, drop the player's rating by 2 points. 2 points is not a lot, but over time if a player stops playing it will drop him off of being relevant on the leaderboard. These points should be dealt with separately from the TrueSkill rating points, and if a player returns from a long absence they should be able to earn them back at twice the rate they were taken away. (Example: if a player does not play a game for 10 weeks he will have lost 20 points; if he then plays a game he will get 4 of the points back. The next week, if he plays 5 more games, he will still gain only 4 more points back. If he plays at least 1 game a week for 5 weeks straight he will have regained the entire 20 points he lost due to inactivity.) This will help to give the current batch of players the ability to climb the leaderboard, and prevent people that only played in the 1st month or two after Planet MULE was created and haven't played since from being eternally in the top 20.

Title: Re: Experimental Ranking System
Post by: C64 nostalgia on October 30, 2010, 06:58

Hmm... where to start...

On further thought:

My ideal ranking system... 10/29/2010 version
A player's ranking can go up or down based on individual performance

The skill of your opponents affects the importance of your win or loss

Number of games played has no direct relation to rank

First place wins are much more important than other places

Margin of a win weakly affects importance of win

Last place finish has a very weak penalty

Colony scores have a weak influence on the rank calculation (for all players)

Make games that start with AI's worthless

Quote from: dynadan on October 29, 2010, 23:43

Yes, I have read the majority of the info on the TrueSkill system (although I must admit a lot of the math simply makes my eyes unfocus). I have been involved in several other projects similar to Planet MULE as far as trying to come up with fair leaderboard systems, and everyone (including me) always assumes that implementing a fair system is easier than it actually is. The TrueSkill system may not be the absolute best system possible, but as far as I know it is the best system that actually exists in the real world. Please feel free to check out the other systems listed on the Xbox website; none of them suits Planet MULE better. Systems like ELO are not really appropriate because they deal with games that are 100% skill, like chess.

I am fairly sure that the TrueSkill system does award 1st place finishes much more than other positions. Although it is possible to still gain rating points for finishing 2nd depending on the skill of your opponents....it is also possible to lose rating points for 2nd, again depending on the skill of your opponents.

I agree that in a perfect system margin of win should also have a small effect on your rating. I also agree that giving total colony score a weak effect on rating would be nice as well. However, I haven't the slightest clue (I doubt you do either) how the developers would add this into the current TrueSkill formula.

Still seems to me that the TrueSkill system is very close to what you actually want. ...

Did you read Collective Choice: Competitive Ranking Systems (http://www.lifewithalacrity.com/2006/01/ranking_systems.html)? The TrueSkill FAQ (http://research.microsoft.com/en-us/projects/trueskill/faq.aspx) points it out after it lists other ranking systems.

From the above article: "The various lessons learned at Days of Wonder underline two basic ideas about rankings. First, even with a well-studied system like ELO, there's still a lot to understand, and, second, any ranking system needs to reflect the specifics of what it's ranking -- and what its purpose is."

Days of Wonder modified ELO to suit their game. So for us, we have to decide what in a MULE game we want recognized and what our rankings are for... My ideal ranking system lays out what I want recognized. And the purpose of our ranking system: we want our rankings to show who is better.

So, let's take TrueSkill... TrueSkill's purpose is to find similar players to make "interesting" matches. TrueSkill's specifics are almost only who wins or loses (with many winners and losers in a multiplayer game). It has no weighting for first, second, third, or fourth place. In a four-player game (assuming everyone has equal skill points and uncertainty), first will gain the most skill points, last will lose the same amount, and second and third will move by smaller amounts in between. TrueSkill is very simple in terms of its inputs from a game. It's designed to be general purpose.

I want Planet MULE's ranking system to take more data and do more with that data than TrueSkill does with its games. I don't know if it's possible to modify TrueSkill to do what I want. I guess you could modify the skill points after the calculations. But to do that right, it seems, you would have to make a piggy-backed system that adds extra points for first place, takes away a much smaller amount for last place, adds bonus points for a big win, and adds or takes points for a particular colony score. Although, it seems if you play with skill values that much, something has to break. Quick aside: ELO, Glicko, and TrueSkill are all popular skill-based ranking systems.

This is a call to all mathematicians, statisticians, and the like. Can TrueSkill be modified, does a better system exist to start with, or can you make Planet MULE something nifty that does everything on my wish list?

Quote from: dynadan on October 29, 2010, 23:43

I disagree with your assertion that people complain about the "luck" elements of Mule. The luck elements are what make Mule such a great game. Without the luck elements there would be very little replayability to the game, e.g. are the pirates going to come this turn, where is the crystite, what price will crystite be the last turn, is a mountain going to move in a quake. Mule behaves like many combo luck/skill games (Poker, Risk, etc.). On any given game there is a chance that anyone can win, but in the long run the more skillful player will always end up winning more. ...

I very much agree with you here...

Quote from: dynadan on October 29, 2010, 23:43

In regards to your handicapping idea: I am neither for nor against the idea in principle. I think this is definitely an idea that could at least be debated for Mule2. But I think trying to do this to the original version would be opening up a new can of worms. I would enjoy seeing some specifics on what you had in mind.

Basically, simple ideas like giving lower ranked players more money and/or goods to start; more time for turns (greater total time and each unit of food gives them more time)... very much in the spirit of the classic beginner's species.

To all: Please don't feel like you can't join the discussion because the recent posts have been replies between dynadan and me. I would love to hear interesting ideas and thoughts.

edit: many changes -- most recently I added, "Quick aside: ELO, Glicko, and TrueSkill are all popular skill-based ranking systems."

Title: Re: Experimental Ranking System
Post by: C64 nostalgia on October 30, 2010, 23:51

I've made many changes to my above post... it's worth rereading if you have already read it.

Title: Re: Experimental Ranking System
Post by: dynadan on November 01, 2010, 06:58

I am not an expert on rating systems, but as far as I know ELO is specifically used in 2-player games. I did read that article you listed before I posted my last post, but since I was unfamiliar with the game it was citing I didn't take very much away from it.

Again, I am on board with most of the things on your wish list, but I have no idea how to stick them in the formula in a fair way. The one thing on your list that I have a problem with is that the number of games has no direct relation to rank. This will make any system very easy to manipulate. Even with the TrueSkill system (where it takes 50-100 games to build rating) it is obvious we have problems that should be addressed. This leads me to my next point.

There is obviously something wrong with the TrueSkill system. But I think by putting our heads together we should be able to solve it. Here is the problem... look at the top 20 players in ranking.

1 288 kipley 229 220 137 62 % 11 46693 158401 7305561
2 279 piete 155 147 91 62 % 34 81240 195258 5090293
3 271 dynadan 364 356 183 51 % 57 67558 196567 10309267
4 265 Rhodan 583 577 271 47 % 64 50143 167668 16867586
5 258 DandyDan 185 182 78 43 % 21 55363 167668 5371988
6 257 BaronHelix 100 93 49 53 % 39 68447 188880 3035048
7 256 maskdbandt 51 48 31 65 % 13 66541 181512 1629263
8 256 Mute 227 226 99 44 % 23 41720 156171 6177059
9 253 mountainwampus 84 79 43 54 % 10 48858 150664 2471496
10 253 fever66666 124 120 59 49 % 14 43963 164268 3645848
11 249 Wahnsinn 313 308 121 39 % 30 44123 196567 8656907
12 249 Mikusch 302 297 114 38 % 29 48455 156780 8538752
13 249 rodz 687 674 303 45 % 67 45043 196567 18557308
14 248 WhosYourBuddy? 268 267 122 46 % 39 73187 198412 7665367
15 245 Gunnar 87 82 48 59 % 27 108612 217391 3015259
16 245 cyounghusband 66 63 37 59 % 14 73988 189719 2048018
17 244 Bumbes 394 389 153 39 % 45 51203 161334 11002284
18 242 Govt Mule 92 88 44 50 % 5 41610 157261 2660026
19 241 dude2005 101 89 54 61 % 15 64364 214110 3272062
20 240 UnMortal

Half of the players on there I don't think should be there, or are no longer active/relevant,
i.e. BaronHelix, maskdbandt, mountainwampus, fever66666, Gunnar, cyounghusband, Govt Mule, and dude2005.
The problem is they are all inactive players. They also do not have very many games (between 50 and 100). All of their games were played when being the host had HUGE advantages that I am sure were abused. Almost all of the players on my list hosted well over 50% of their games. They also got all of their rating points playing early on, before the field had really been established.
Take maskdbandt as an example... he is ranked number 7 all time, and he played 51 games total (48 legally), of which he hosted 31. His last played game was January 28th.
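
The hosting claim can be checked against the table with a little arithmetic. The Python sketch below assumes, following the reading of the maskdbandt row in this post, that the second and third figures after each name are legally completed games and hosted games; the table itself carries no column headers, so that reading is an assumption, and the helper name host_share is made up for illustration.

```python
# Figures copied from the top-20 table: name -> (legal games, hosted games).
# Interpreting the columns this way is an assumption based on the post's
# description of the maskdbandt row; the table has no headers.
INACTIVE_PLAYERS = {
    "BaronHelix": (93, 49),
    "maskdbandt": (48, 31),
    "mountainwampus": (79, 43),
    "fever66666": (120, 59),
    "Gunnar": (82, 48),
    "cyounghusband": (63, 37),
    "Govt Mule": (88, 44),
    "dude2005": (89, 54),
}


def host_share(legal_games, hosted_games):
    """Return hosted games as a whole-number percentage of legal games."""
    return round(100 * hosted_games / legal_games)


if __name__ == "__main__":
    for name, (legal, hosted) in INACTIVE_PLAYERS.items():
        print(f"{name}: hosted {host_share(legal, hosted)}% of legal games")
```

Under that reading, the computed shares line up with the percentage column printed in the table for each of these eight players.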

So there's the problem I have with the current ranking system, and hopefully we can think of a solution. I just don't see how you can have a system that has BaronHelix and maskdbandt in the 6 and 7 spots and have Brad1867 not even in the top 20.

I know C64Nostalgia has come up with some good suggestions he wanted added into the ranking system, and while I think they would be good additions, I realize how much work it could turn into for the programmers, and they would also not solve the problems I just described with the top 20 (and the entire list; I just happened to study the top 20).

So here is my solution again, now that I have described more clearly why we need it. This should be fairly simple to add onto the TrueSkill formula. In fact it shouldn't be used with the formula at all, but should only be done after the fact to show a more accurate list. If players decide to come back from an absence they can regain their points fairly fast.

SUGGESTED ADDITION
For every week of inactivity, drop the player's rating by 2 points. 2 points is not a lot, but over time if a player stops playing it will drop him off of being relevant on the leaderboard. These points should be dealt with separately from the TrueSkill rating points, and if a player returns from a long absence they should be able to earn them back at twice the rate they were taken away. (Example: if a player does not play a game for 10 weeks he will have lost 20 points; if he then plays a game he will get 4 of the points back. The next week, if he plays 5 more games, he will still gain only 4 more points back. If he plays at least 1 game a week for 5 weeks straight he will have regained the entire 20 points he lost due to inactivity.) This will help to give the current batch of players the ability to climb the leaderboard, and prevent people that only played in the 1st month or two after Planet MULE was created and haven't played since from being eternally in the top 20.
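
The suggested addition can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the penalty is tracked week by week, kept apart from the TrueSkill value, and subtracted only when the leaderboard is displayed. The function names and the weekly bookkeeping are hypothetical and not part of Planet MULE's actual code.

```python
DECAY_PER_IDLE_WEEK = 2       # points lost per week without a game (from the post)
RECOVERY_PER_ACTIVE_WEEK = 4  # points regained per week with at least one game (from the post)


def update_penalty(penalty, games_this_week):
    """Return the inactivity penalty after one more week.

    The number of games in an active week does not matter: one game or
    five games both give back the same 4 points, never going below zero.
    """
    if games_this_week == 0:
        return penalty + DECAY_PER_IDLE_WEEK
    return max(0, penalty - RECOVERY_PER_ACTIVE_WEEK)


def displayed_rating(trueskill_rating, penalty):
    """Leaderboard value: the untouched TrueSkill rating minus the penalty."""
    return trueskill_rating - penalty


def simulate(weekly_games, penalty=0):
    """Return the penalty after each week for a list of weekly game counts."""
    history = []
    for games in weekly_games:
        penalty = update_penalty(penalty, games)
        history.append(penalty)
    return history


if __name__ == "__main__":
    # Ten idle weeks, then five active weeks (1, 5, 1, 1, 1 games).
    print(simulate([0] * 10 + [1, 5, 1, 1, 1]))
```

Keeping the penalty in its own counter means the TrueSkill calculation never sees it, which matches the request that it be applied only after the fact.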

I would be happy to hear other people's ideas on the subject. I am not married to my idea, it was just the best and most fair addition I could come up with. Resetting stats to zero is also a possible solution, except we will just have the same problem again in 6 months. And it really would be a shame to get rid of all those recorded games.

edited: Dropped the "E" bomb.

Title: Re: Experimental Ranking System
Post by: C64 nostalgia on November 01, 2010, 08:24

A new addition to my ideal ranking system:
Slight penalty for hosting (dynadan gave me this idea)

TrueSkill is a matchmaking (ranking) system designed for Xbox Live. Microsoft wanted it to apply to as many games as it could. According to Wikipedia, "150 Xbox games use TrueSkill" -- games with hundreds of thousands and millions of players. The general-purpose nature of TrueSkill means it is, by definition, not tailored to M.U.L.E. TrueSkill isn't necessarily unsuited, but it definitely doesn't reflect most of the specifics that make for a good win and thus a good M.U.L.E. player.

From TrueSkill's home page (http://research.microsoft.com/en-us/projects/trueskill/default.aspx), "the TrueSkill ranking system can identify the skills of individual gamers from a very small number of games." Microsoft states that for a 4-player game, as few as 5 games are needed to identify a skill level. The maximum is still only 15. So, when I stated "Number of games played has no direct relation to rank," I wasn't including the uncertainty part of a ranking system. The uncertainty part is indirect. The main reason I brought up "number of wins" is that it is one of the chief problems of the current ranking system.

The very simple fix to the stale-player-in-the-top-20 problem is resetting the ranks at every major version/gameplay change or after every specified amount of time. Then, allow us to see all the old ones, and compile other great-player-type boards from them. Regardless, a ranking data reset for whatever replacement is almost mandatory anyway -- so much of the old data is effectively corrupted with bugs and gameplay changes from old versions.

I maintain we need our ranking system to use more information from our games. If this can be added to TrueSkill, all the better. Although, I still think TrueSkill has inherent problems when used on Planet MULE. I plead for better solutions... Please math gods, Help Us.

Title: Re: Experimental Ranking System
Post by: dynadan on November 01, 2010, 11:27

I considered punishing hosts, but it would just lead to fewer hosts and fewer games. The developers have done a fantastic job removing the host advantages. There are no longer any actual play advantages for the host. So I don't think I can endorse a system that punishes hosts.

Maybe you guys are right about resetting the stats. I just thought there could be a more elegant solution to phase out the older games. It's not that they were unfair or meaningless games, everyone was playing under the same rules. But since a lot of smurfs have appeared lately maybe people are ready to start over anyway.

I am not sure colony score is a good indicator of how good a mule player you are. The best players I know are the ones that can choose the side of things that will give them the biggest advantage. Often times the best players are fighting to keep the price of ore down rather than up. In fact 4 bad players are much more likely to have a high scoring colony than a game with 4 very good players.

Margin of victory is still probably a good stat, although it may be the easiest to manipulate by cheating. In games with a huge margin of victory you can almost guarantee there was some (sock)monkey-business going on. Maybe cap the margin you get credit for at 5k? This would certainly change the way I play the game if this stat was used. I tend to try not to handicap a player if I think I am going to win anyway.

I just had a thought....Maybe instead of a ranking system you could come up with a format for a tournament using the stats that you wanted. It would help solidify in your mind what is important and start helping to work out the bugs. Maybe a tournament is a good idea anyway whether or not you use a new system at all. Sounds like fun.....maybe a double elimination bracket?

Title: Re: Experimental Ranking System
Post by: piete on November 01, 2010, 12:03

Sorry, only a quick comment, too busy at work...

The first versions' high scores were obtained with non-spoiling smithore, so at the very least high scores should be reset after every version change.

Victories obtained then are still victories, and you can never tell when the playing field is totally established. I got the first position by playing a total of 50 games, and by that time I had played the most games of all! And it really felt like I had played a lot. So in my opinion, only decrease their weight; you don't need to remove them. Or else reset the victory scores after every new version, too. Not a big deal for me anyway.

When we talk about colony high scores, at least in the original game it is very difficult to drive the score much higher by co-op play than by a good competitive game. And if you manage it, it would only be proof that socialism works better than capitalism (at least in the Mule universe ;) ). I cannot tell if this version is equally balanced since I never played a co-op game here. Anyway, in this sense, colony score is a good indicator of at least something.

Title: Re: Experimental Ranking System
Post by: Mt-Wampus on November 01, 2010, 14:11

I have an easy solution for the rankings system. DON'T HAVE ONE!

Title: Re: Experimental Ranking System
Post by: Mt-Wampus on November 01, 2010, 14:18

Quote from: dynadan on November 01, 2010, 11:27

Title: Re: Experimental Ranking System
Post by: C64 nostalgia on November 06, 2010, 08:58

I noticed the new "Ranks" link on Planet MULE's main navigation bar. Along with it, the prototype ranking system appears to be continuously updating... I believe the developers have proceeded with their plans using TrueSkill as the basis for Planet MULE's ranking system.

TrueSkill's design allows it to be used with a wide variety of Xbox Live games. This ability stems from counting almost solely wins and losses. However, MULE is more than just wins and losses. Thus, I suggest adding some MULE-specific data to enrich the ranking results.

My idea starts by letting TrueSkill behave as it normally does, but before players' skill values are adjusted after a game, a step is added. This step is a skill point modifier (SPM) consisting of the unique things that make a M.U.L.E. win or loss special. The SPM is applied to the individual player's skill point change before it adjusts their skill value. Let me start listing and explaining the factors making up the SPM.

1.6x to the winner's SPM. This gives strong weight to a first place finish. Winning a game of MULE means taking first. First ought to be counted as being more important. The cliché that second place is the first loser is very apt for MULE.

0.2x (max) is added to the winner's SPM, giving weight to the margin of a win. The amount is calculated by taking the difference between first and second place's scores, dividing that by second place's score, and finally multiplying that by 0.2.

0.2x (max) is added to (or subtracted from) all players' SPMs based on colony scores. Using the colony achievements as a guideline, 0.2x is awarded for the highest achievement, a score of 120,000+; no award for the middle achievement, 60,000 to 79,999; and a penalty of 0.2x for the lowest achievement, <20,000. The in-between achievements use 0.05 and 0.1, respectively.

0.015x penalty goes to the host. This component is to compensate for having the best pings. While changes to the game have made for much less of the host advantage, it still exists. Notice the very small value of this component.
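
The factors listed above can be combined in a short Python sketch. Several details are assumptions and not part of the proposal as written: non-winners start from a base of 1.0, the in-between colony tiers are split at 20,000-point steps, the margin ratio is clamped so the bonus never exceeds 0.2, and the SPM simply multiplies the raw TrueSkill change. The post also does not say how a multiplier should treat a negative change, so the sketch leaves that question open. All function and constant names are hypothetical.

```python
WINNER_BASE = 1.6      # from the post: strong weight for first place
DEFAULT_BASE = 1.0     # assumption: unmodified change for everyone else
MARGIN_MAX = 0.2       # from the post: maximum margin-of-win bonus
HOST_PENALTY = 0.015   # from the post: small penalty for the host

# (lower bound of colony score, adjustment). The 120,000+, 60,000-79,999
# and <20,000 tiers come from the post; the other boundaries are assumed.
COLONY_TIERS = [
    (120_000, 0.2),
    (100_000, 0.1),
    (80_000, 0.05),
    (60_000, 0.0),
    (40_000, -0.05),
    (20_000, -0.1),
]
LOWEST_COLONY_ADJUSTMENT = -0.2


def margin_bonus(first_score, second_score):
    """0.2 * (first - second) / second, limited to the range 0..0.2."""
    if second_score <= 0:
        return MARGIN_MAX
    ratio = (first_score - second_score) / second_score
    return MARGIN_MAX * min(max(ratio, 0.0), 1.0)


def colony_adjustment(colony_score):
    """Look up the colony-score adjustment shared by all players in a game."""
    for lower_bound, adjustment in COLONY_TIERS:
        if colony_score >= lower_bound:
            return adjustment
    return LOWEST_COLONY_ADJUSTMENT


def skill_point_modifier(is_winner, is_host, colony_score,
                         first_score=0, second_score=0):
    """Combine the four factors into one multiplier for a player's change."""
    spm = WINNER_BASE if is_winner else DEFAULT_BASE
    if is_winner:
        spm += margin_bonus(first_score, second_score)
    spm += colony_adjustment(colony_score)
    if is_host:
        spm -= HOST_PENALTY
    return spm


if __name__ == "__main__":
    # A non-hosting winner who beat second place 12,000 to 10,000
    # in a colony that scored 65,000.
    print(skill_point_modifier(True, False, 65_000, 12_000, 10_000))
```

The adjusted change would then be the raw TrueSkill change times the returned value, applied before the player's skill value is updated.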

Implementing my suggestion would give meaning to wins and colonies (and help compensate for the privilege of hosting). The implementation would also help make the ranking system reflect specific qualities showing truly better MULE players.

edit: If I've been unclear or something doesn't make sense above, please feel free to point it out. I will try my best to explain my suggestion better...

Title: Re: Experimental Ranking System
Post by: C64 nostalgia on November 06, 2010, 20:50

I wanted my Skill Point Modifier idea to be self-contained in its own post. But, I wanted another post to add my other suggestions and thoughts. I will first lay out my full desire. Then, I will provide a compromise. (I apologize if some of the following is repetitive from other posts. My only excuse is the following is that important, and I'm trying to consolidate my positions in this and the previous post.)

Reset the ranking data: This is a longstanding request of many players. Old game data is derived from many long-outdated versions of Planet MULE, versions that included countless bugs and significant changes in game mechanics. The old data will only harm the integrity of game data from the current version of Planet MULE. Not only that, but resetting the ranking data will better show the best players, because the rankings will reflect game strategies evolved over hundreds of games. (Resetting the rankings after every major bug fix and/or gameplay change would continue to honor the best players of tomorrow's Planet MULE, as well.)

At the very least, reset the rankings starting from the last update. This game data is relevant and reflective of current strategies within the gameplay of the current game. Keep the old data displayed somewhere, but do not let it taint the new rankings.

Make games that start with AI's unranked: I'm so very happy that serious players as of late rarely play games that start with AI's. Earlier in Planet MULE's history, it was fairly common. This "botless" phenomenon is truly a marvelous evolution. With good reason, players avoid the AI. The AI is painfully laughable. (My apologies to the guy who tweaked the AI's. My statement has nothing to do with him personally.) The old 1983 AI's play better. The ranking system should not reward wins using a predictable and bad player. AI's are worse than new players because they are extremely easy to game. They always play similarly. Furthermore, once you learn their patterns, they become more like colluders and much less like real players.

At the very least, make the skill value for an AI the same as that of the lowest-ranked player, currently 57 -- relatively accurate. Games that start with 2 AI's should definitely not be ranked. Again, AI's are horrible players, and more importantly, doing this would also eliminate the ruse of players who habitually play 2 bots and a new player.

Title: Re: Experimental Ranking System
Post by: dynadan on November 07, 2010, 11:18

Quote from: C64 nostalgia on November 06, 2010, 08:58

My idea starts by letting TrueSkill behave as it normally does, but before players' skill values are adjusted after a game, a step is added. This step is a skill point modifier (SPM) consisting of the unique things that make a M.U.L.E. win or loss special. The SPM is applied to the individual player's skill point change before it adjusts their skill value. Let me start listing and explaining the factors making up the SPM.

1.6x to the winner's SPM. This gives strong weight to a first place finish. Winning a game of MULE means taking first. First ought to be counted as being more important. The cliché of second is first loser is very apt for MULE.

0.2x (max) is added to the winner's SPM giving weight to the margin of a win. How much of 0.2 is calculated by taking the difference between first and second's score, dividing that by the second place's score, and finally multiplying that by 0.2.

0.2x (max) is added (or subtracted) to all players' SPMs based on colony scores. Using the colony achievements as a guideline, 0.2x is awarded for the highest achievement, a score of 120,000+; no award for the middle achievement, 60,000 to 79,999; and a penalty of 0.2x for lowest achievement, <20,000. The in-between achievements use 0.05 and 0.1, respectively.

0.015x penalty goes to the host. This component is to compensate for having the best pings. While changes to the game have made for much less of the host advantage, it still exists. Notice the very small value of this component.

Implementing my suggestion would give meaning to wins and colonies (and help compensate for the privilege of hosting). The implementation would also help make the ranking system reflect specific qualities showing truly better MULE players.

edit: If I've been unclear or something doesn't make sense above, please feel free to point it out. I will try my best to explain my suggestion better...

I was wondering if it was possible for you to show us an example of how you would work the numbers? I would very much like to see the difference for wins and losses from a 5 game sample using the equation you had in mind next to the same sample with just the TrueSkill system. I think the margin of victory is a good idea, but I am not quite understanding your formula for that one.

Also, I think you should just take out the host negative factor; a lot of votes will go against that. And we really don't want to have fewer hosts on Planet MULE. The land auction still isn't perfect, but other than that there is no host advantage anymore.

Quote from: C64 nostalgia on November 06, 2010, 08:58

Reset the ranking data: This is a longstanding request of many players. Old game data is derived from many long-outdated versions of Planet MULE, versions that included countless bugs and significant changes in game mechanics. The old data will only harm the integrity of game data from the current version of Planet MULE. Not only that, but resetting the ranking data will better show the best players, because the rankings will reflect game strategies evolved over hundreds of games. (Resetting the rankings after every major bug fix and/or gameplay change would continue to honor the best players of tomorrow's Planet MULE, as well.)

At the very least, reset the rankings starting from the last update. This game data is relevant and reflective of current strategies within the gameplay of the current game. Keep the old data displayed somewhere, but do not let it taint the new rankings.

Make games that start with AI's unranked: I'm so very happy that serious players as of late rarely play games that start with AI's. Earlier in Planet MULE's history, it was fairly common. This "botless" phenomenon is truly a marvelous evolution. With good reason, players avoid the AI. The AI is painfully laughable. (My apologies to the guy who tweaked the AI's. My statement has nothing to do with him personally.) The old 1983 AI's play better. The ranking system should not reward wins using a predictable and bad player. AI's are worse than new players because they are extremely easy to game. They always play similarly. Furthermore, once you learn their patterns, they become more like colluders and much less like real players.

At the very least, make the skill value for an AI the same as that of the lowest-ranked player, currently 57 -- relatively accurate. Games that start with 2 AI's should definitely not be ranked. Again, AI's are horrible players, and more importantly, doing this would also eliminate the ruse of players who habitually play 2 bots and a new player.

Since they are apparently keeping the high scores page, you have convinced me that resetting the stats may be a good idea.

I wouldn't make AI games completely unranked, but maybe a simple way to do it would be 1/2 value for all rating points for 1 AI games and 1/4 points for 2 AI. On the current system this seems about right to me, but I am not sure how it will mesh with what you have in mind. I am not exactly sure how TrueSkill is handling the AI's right now; it almost seems like they just leave them out. If that's the case, maybe leaving them out and just sticking a modifier on would be the way to go.

I am not against anything you have in mind, I would just like to see some more numbers so we can discuss the weighting.

Title: Re: Experimental Ranking System
Post by: C64 nostalgia on November 08, 2010, 03:48

My SPM suggestion and my Reset/AI Unranked suggestions both concern the ranking system. However, they pertain to separate aspects. So, I will reply in separate posts in an effort to not confuse the two.

Quote from: dynadan on November 07, 2010, 11:18

Quote from: C64 nostalgia on November 06, 2010, 08:58

Cool, note dynadan and I agree resetting the stats is a good idea.

First, I want to make sure everyone understands when I refer to AI games, I mean games that begin with AI players. I am not talking about games where an AI fills in for a player (timed out, bailed, router crash... et cetera).

I looked at akire1's (TrueSkill) ranks page. Most recently, when there are 2 AI's and 2 humans, the skill point changes tend to be 0.2 or 0.3. (If you go further back to older games, the changes tend to be higher but generally under 0.5.) The change is positive for the higher placed human and negative for the lower placed one. This seems to happen regardless of where the AI's place. The AI's appear to be placeholders more than anything else. I'm sure the developers could explain how they are currently treating AI's better than I can.

Nonetheless, AI's are horrible and predictable players. I'm not sure why you think games that start with AI players should be ranked. Maybe you could explain why you feel games that start with AI's are important to demonstrate a player's skill.

The reason I stress 2 AI and 2 human player games be unranked is because of the abuse repeatedly demonstrated by certain players. Taking a new player and using the AI's to dominate them should not be rewarded. Plus, with 2 AI's the dynamic of a game is very simple and consistently predictable -- hardly an arena to show high level skills.

That leaves games that start with one AI. Since AI's are horrible and predictable players, they should have horrible skill values. Players should feel pain when they lose to an AI. Another way, I suppose, is to treat the AI like a real ranked player. Then, it will receive a skill value according to its talent. To simply ignore AI's makes them more meaningless than they already are.

Title: Re: Experimental Ranking System
Post by: Rhodan on November 08, 2010, 04:34

Quote from: C64 nostalgia on November 06, 2010, 20:50

Reset the ranking data: This is a longstanding request of many players. Old game data is derived from many long-outdated versions of Planet MULE, versions that included countless bugs and significant changes in game mechanics. The old data will only harm the integrity of game data from the current version of Planet MULE. Not only that, but resetting the ranking data will better show the best players, because the rankings will reflect game strategies evolved over hundreds of games. (Resetting the rankings after every major bug fix and/or gameplay change would continue to honor the best players of tomorrow's Planet MULE, as well.)

At the very least, reset the rankings starting from the last update. This game data is relevant and reflective of current strategies within the gameplay of the current game. Keep the old data displayed somewhere, but do not let it taint the new rankings.

Agreed.

Title: Re: Experimental Ranking System
Post by: C64 nostalgia on November 14, 2010, 00:31

Quote from: GambitTime on October 16, 2010, 09:38

Quote from: GambitTime on October 16, 2010, 10:04

I did a very quick reread of this thread -- skipping dynadan's and my posts. While dynadan and I have posted at length in this thread, I wanted to bring back a couple of posts I really liked by GambitTime.

In agreement, M.U.L.E. is about winning! Wild risks for a play at first founder can make M.U.L.E. exciting. Furthermore, giving an incentive for a margin of victory gives the player in first a reason to take risks, too. Then, to balance that margin and give everyone a chance at a boost, a colony award is offered. This is my basis for almost all of my SPM.

A quick rerun through my skill point modifier (SPM) idea. The skill point change determined by the ranking system after a game is multiplied by the SPM before that change is added to the player's skill value. (These exact numbers are somewhat arbitrary, but they serve as a general guide.)

0.6x to the winner's SPM

0.2x (max) is added to the winner's SPM giving weight to the margin of a win. How much of 0.2 is calculated by taking the difference between first and second's score, dividing that by the second place's score, and finally multiplying that by 0.2.

0.2x (max) is added (or subtracted) to all players' SPMs based on colony scores. Using the colony achievements as a guideline, 0.2x is awarded for the highest achievement, a score of 120,000+; no award for the middle achievement, 60,000 to 79,999; and a penalty of 0.2x for lowest achievement, <20,000. The in-between achievements use 0.05 and 0.1, respectively.

The 0.X factors were not explained well. The SPM always starts at 1x. The 0.X parts are added or subtracted. So, a win with the worst colony penalty becomes 1 + 0.6 - 0.2 = 1.4x. Or, a 2nd, 3rd, or 4th with the best colony award becomes 1 + 0.2 = 1.2x for positive skill point changes and 1 - 0.2 = 0.8x for negative skill point changes.

Another way to think about the SPM is by moving the decimal 2 places, thus everything becomes a percentage. 1x becomes 100% of the skill value change. A win, 1.6x becomes 160% of the skill value change. For positive skill point changes, one wants the highest percentage for the biggest upward movement. For negative skill point changes, one wants the lowest percentage to minimize negative movement.

The last bit about a hosting penalty... While I applaud the developers' efforts to mitigate "host advantages," a host still gets the best pings. Their side of their games will be the most responsive. Additionally, if the lag is big enough (as little as 250-300), their low pings can overcome the host advantage mitigators. For example, a higher-placed host can steal a plot from a lower-placed player. The penalty is small, but serves a noble purpose -- it tries to make things a bit more fair.

0.025x penalty goes to the host.
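
Putting the pieces of this rerun together, here is a rough sketch of how the SPM could be combined and applied. It is only an illustration: the function and parameter names are invented, the margin bonus and colony adjustment are passed in as already-computed numbers, and treating the host penalty as something that shrinks gains and enlarges losses is an assumption the post does not spell out.

```python
def skill_point_modifier(delta, is_winner=False, margin_bonus=0.0,
                         colony_adjust=0.0, is_host=False,
                         win_bonus=0.6, host_penalty=0.025):
    """Return the SPM multiplier for one player's skill point change.

    delta         -- raw skill point change from the ranking system
    margin_bonus  -- 0.0 to 0.2, only counted for the winner
    colony_adjust -- -0.2 to +0.2, from the colony achievement
    """
    bonus = colony_adjust
    if is_winner:
        bonus += win_bonus + margin_bonus
    if is_host:
        # Assumption: the host penalty simply lowers the total bonus.
        bonus -= host_penalty
    # The SPM starts at 1x. For a positive change the bonus enlarges
    # the gain; for a negative change it shrinks the loss.
    if delta >= 0:
        return 1.0 + bonus
    return 1.0 - bonus


def adjusted_change(delta, **factors):
    """Skill point change after the SPM has been applied."""
    return delta * skill_point_modifier(delta, **factors)


# A win with the worst colony penalty: 1 + 0.6 - 0.2
print(skill_point_modifier(2.0, is_winner=True, colony_adjust=-0.2))
# A non-winner with the best colony award, gaining and losing points
print(skill_point_modifier(2.0, colony_adjust=0.2))
print(skill_point_modifier(-2.0, colony_adjust=0.2))
```

The three calls correspond to the 1.4x, 1.2x, and 0.8x cases worked out above, up to floating-point rounding.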

The problem with being able to give results from a 5 game sample is the developers are the only people who know Planet MULE's actual equations and variables in calculating the leaderboard. I don't know how a SPM will affect sigma values and player skill values (mu) over many games. Will the SPM make a meaningful difference? Will it muck up the uncertainty portion (the sigma stuff) and create strange and unreliable results? I just don't have the needed information (and potentially the skill and/or tools) to do the statistical samples, but I am very curious and hopeful.

[edit: proofing, more clarification, and added the percentage analogy]

Title: Re: Experimental Ranking System
Post by: C64 nostalgia on November 14, 2010, 01:05

Quote from: Mt-Wampus on November 01, 2010, 14:18

Punishing hosts ? Guys come on Planetmule and are begging for somebody to host a game so they can play and you're talking about PUNISHING hosts ???? Talk about ungrateful. Slap in the face to all legit hosts of the past who have been kind enough to fire up a game.

The primary reason why people beg for hosts is because not enough people know how to port forward... Learning how can be time-consuming, frustrating, and for some, virtually impossible. Please do not confuse a lack of hosts with adding more fairness to the rankings.

The reason I advocate a cost for hosting is because I've seen (and played with) the benefits a player gets by hosting. I want games to be fair. It's that simple. Last, more often than not, when I've seen someone choose not to host (assuming a reliable game can be set up), it was to give an advantage to a weaker player or a player with high pings.

[Part of the above was cut from my previous post. Players with Macs check out: A simple way to Port Forward on a Mac -- Port Map (http://www.planetmule.com/forum?topic=686.0). Maybe it can help you with your hosting woes.]

[edit: proofing and clarifying]

Title: Re: Experimental Ranking System
Post by: data2008 on November 14, 2010, 09:53

@C64-Nostalgia:
very good and thought-out ideas!
We will think of implementing something like "SPM"s in the near future.

Title: Re: Experimental Ranking System
Post by: Peter on November 15, 2010, 14:20

Thanks for all the feedback.

Quote

I'm sure the developers could explain how they are currently treating AI's better than I can.

To clear this up. AI opponents are removed when the ranking is updated. A game with 2 humans and 2 bots will be treated as a 2 player game. It doesn't matter where the AI's are placed.

Games with 2 humans are weighted by 10%
Games with 3 humans are weighted by 60%
Games with 4 humans are weighted by 100%

Weighted means that the change in the player's mean skill value and uncertainty is reduced to the specified percentage.

For example, if a player has a skill mean of 25 and an uncertainty of 8, which should be updated with +2 mean and -1 uncertainty, and it was a 2 player game, then the changes are reduced to +0.2 and -0.1.
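
The weighting above can be sketched in a few lines. This is not the actual Planet MULE code; the names `HUMAN_COUNT_WEIGHT` and `weighted_update` are made up for illustration, and only the three weights and the example numbers come from this post.

```python
# Weight applied to a rating update, keyed by the number of human players.
# AI opponents are removed before the update, so only humans count.
HUMAN_COUNT_WEIGHT = {2: 0.10, 3: 0.60, 4: 1.00}


def weighted_update(mean, uncertainty, d_mean, d_uncertainty, humans):
    """Apply a rating change scaled by the number of human players.

    d_mean and d_uncertainty are the unweighted changes the ranking
    algorithm would otherwise apply. Games with fewer than 2 humans
    are not covered here and raise a KeyError.
    """
    weight = HUMAN_COUNT_WEIGHT[humans]
    return mean + weight * d_mean, uncertainty + weight * d_uncertainty


# The example from the post: mean 25, uncertainty 8, raw update +2 / -1,
# played as a 2 player game, so only 10% of the change is applied.
new_mean, new_uncertainty = weighted_update(25.0, 8.0, 2.0, -1.0, humans=2)
print(new_mean, new_uncertainty)
```

The same call with `humans=4` would apply the full +2 and -1.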

Like C64's SPM idea, it's of course possible to weight the change in skill by any factors. C64's modifiers will mainly give you less of a penalty if you lose games and make 2nd place less valuable.