In general, rating systems where the top vote is the "normal" vote are fucked up. There's always noise in how each party perceives the transaction. As the supplier, the only thing you can do is kiss the ass of everyone in hopes of placating the overly negative ones. And as a buyer, you're unable to reward truly above-and-beyond service.
Arthur C. Clarke wrote a great essay on the challenges of rating systems, in "The Servant Problem - Oriental Style" (included in The View from Serendip: http://www.powells.com/biblio/2-9780345271082-1)
It's a bit more nuanced than this, but the fundamental dilemma is:
• An overly negative review essentially dooms someone to never working again.
• An overly positive one leads to sticky questions from the next person to hire them (whom you probably know socially).
Clarke's solution is to write ... very closely ... an accurate but difficult to parse recommendation. As I recall, the essay ends with him noting that a household servant he'd dismissed some years before (pursuing a "flim" career) had since returned, to the pleasure of both parties.
Sadly I can't find a copy online -- seems that at a $1.78 purchase price the friction of commerce is excessive here for a 50-year-old essay.
Totally agreed. I also think Uber should be slightly more explicit in their ratings screen (e.g. Bad Experience, Good Experience, Exceptional Experience)
If someone has a problem with a ride, the only thing they really care about is letting Uber know that the ride was terrible (I bet the rating distribution clusters heavily around 1, 4 and 5 stars).
Looking at passenger rating histories should also play a role in how these are interpreted (if someone is an asshole and rates tons of drivers poorly even if they're well liked, that should be discounted)
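A minimal sketch of that discounting idea, assuming each rating is stored as a (passenger, driver, stars) tuple. The function name and data layout are made up for illustration, and nothing here reflects how Uber actually processes ratings. Each rating is measured against the passenger's own average, so a habitual low rater stops dragging down every driver they meet.

```python
from collections import defaultdict
from statistics import mean

def adjusted_driver_scores(ratings):
    """ratings: iterable of (passenger, driver, stars) tuples (hypothetical layout).

    Returns {driver: mean offset from each rater's personal average}.
    """
    ratings = list(ratings)

    # Each passenger's personal baseline: the mean of every rating they gave.
    by_passenger = defaultdict(list)
    for passenger, _driver, stars in ratings:
        by_passenger[passenger].append(stars)
    baseline = {p: mean(s) for p, s in by_passenger.items()}

    # Score each driver by how far above or below the rater's baseline they landed.
    offsets = defaultdict(list)
    for passenger, driver, stars in ratings:
        offsets[driver].append(stars - baseline[passenger])
    return {d: mean(o) for d, o in offsets.items()}

ratings = [
    ("grump", "a", 2), ("grump", "b", 2), ("grump", "d", 2),
    ("nice", "a", 5), ("nice", "b", 5), ("nice", "c", 3),
]
scores = adjusted_driver_scores(ratings)
```

Driver "d" was only ever rated by the passenger who gives everyone 2 stars, and ends up with a neutral offset of 0 instead of a bad score. Driver "c" got a 3 from someone who normally gives 5, so that one does count against them.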
By doing something like this, you have a way of weeding out bad drivers, rewarding great drivers, and leaving everyone in the middle alone.
Sidecar does something like this. They ask if it was bad, good, or exceptional. If you say good, they ask what could've been improved -- and I usually find myself selecting "nothing in particular."
That gives a very accurate picture of the driver. It's cool that sometimes my Sidecar driver gives me candy and tells jokes, but that's not why I use the service.
I expect a driver to get me from A to B, and notice whether I'm interested in talking (in which case talk and be friendly) or not interested in talking (in which case don't talk please).
A half-pint of water is nice but hardly necessary - it won't convert a horrible ride into a great one, and its absence doesn't make a great ride a bad one.
So much is outside the control of the driver - a late night drive with no traffic is generally more pleasant than a sluggish rush-hour drive, but it's not the driver's fault.
I once took a longer Uber Black ride to pick up my wife a few towns away, who was quite sick. Once I mentioned this to the driver, he drove noticeably more 'efficiently', offered me his phone to call to check on her once mine had died, and waited outside the building briefly to make sure everything was ok.
Also, I've got a bit of prior race driving experience, and this guy clearly grokked car control. I wouldn't be surprised if he'd spent a good bit of time on a track. He also, without drawing any attention to it in any way, adjusted the suspension settings from 'firm' when we were going around on- and off-ramps to comfort on the highway, to keep the ride comfortable and level.
I gave him a five-star rating and sent Uber customer support an email raving about him. Wish I could have done more.
I think in this case you did it right with a letter to customer service.
When the "expectation" is a 4/5 star rating, then the average is generally 4.6-4.7 stars and there's no way to distinguish "normal, good service" from truly "above and beyond".
Additionally, if the "above and beyond" just gains a 5 star review, that doesn't really give the feedback needed for extraordinary service.
In my experience with cab drivers, if they don't offer unsolicited conversations about religion or politics, that's an exceptional experience. When I say conversations, I mean they just drone on and on, one-sided, about how great God is and how Obama is ruining the country.
I'm curious what would happen if they asked a simple y/n question: "Was there anything negative about your Uber experience?"
Of course, this means you can't really reward drivers who are amiable and make you feel comfortable. But I feel as though answering "no" to the above question implies that your ride was at least a "4."
I think three tiers of experience make more sense anyway. Your ride was either Bad, Good, or Exceptional. It's difficult to gauge what a 2/5 or 3/5 means.
But that misses their own threshold. Rate 4 and you're not asked a thing (if I understand you correctly), but the driver might be kicked out if enough people do the same, right?
If 4.5 is the minimum average rating, everyone not rating 5 should be asked what was wrong. Or ... fix the rating so that not everything needs 5 out of 5.
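To put rough numbers on how tight that is, here's a small sketch of the arithmetic, taking the 4.5 minimum average mentioned above as the cutoff. The function name is made up. It answers: if every rating that isn't a 5 is some fixed lower value, what share of 5s does a driver need to stay at or above the threshold?

```python
from fractions import Fraction

def min_five_star_share(other_stars, threshold=Fraction(9, 2)):
    """Smallest share f of 5-star ratings with 5*f + other_stars*(1 - f) >= threshold.

    Assumes every non-5 rating equals other_stars (an integer from 1 to 4)
    and that the threshold lies between other_stars and 5.
    """
    if other_stars not in (1, 2, 3, 4):
        raise ValueError("other_stars must be an integer from 1 to 4")
    # Solve 5f + other*(1 - f) = threshold for f.
    return (threshold - other_stars) / (5 - other_stars)

share_vs_fours = min_five_star_share(4)  # Fraction(1, 2)
share_vs_ones = min_five_star_share(1)   # Fraction(7, 8)
```

So a driver who only ever gets 4s and 5s still needs half of them to be 5s, and a single 1-star takes seven 5-stars to cancel out.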
I think ideally the ratings should be done by machine learning instead. Rather than returning an average of previous ratings, return a prediction like "there is a 50% chance you will like this and 10% dislike it (40% chance you don't rate it at all)."
This way a small sample size doesn't distort the rating too much, predictions can be customized for every individual, and it's somewhat resistant to fake reviews, if they can be picked up on by the computer. It also gives you an incentive to rate, and to rate what you actually think.
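As a very rough sketch of the "small sample size doesn't distort the rating" part: this is not a machine learning model, just additive smoothing that pulls a driver's observed outcome rates toward fleet-wide rates until enough rides accumulate. The outcome labels, the prior (borrowed from the 50/10/40 example above), and the prior_weight value are all assumptions for illustration.

```python
def predict_outcomes(counts, prior, prior_weight=20.0):
    """Estimate outcome probabilities for one driver's next ride.

    counts: observed outcomes for this driver, e.g. {"like": 3}
    prior: fleet-wide outcome rates, summing to 1 (assumed values)
    prior_weight: how many "virtual rides" the prior is worth (assumed value)
    """
    observed = sum(counts.get(k, 0) for k in prior)
    total = observed + prior_weight
    return {k: (counts.get(k, 0) + prior_weight * prior[k]) / total for k in prior}

prior = {"like": 0.5, "dislike": 0.1, "no_rating": 0.4}
new_driver = predict_outcomes({}, prior)
three_likes = predict_outcomes({"like": 3}, prior)
```

A driver with no rides gets exactly the fleet-wide prediction, and three straight likes only nudge the "like" probability up a little rather than jumping to 100%. Per-passenger customization and fake-review detection would need a real model on top of this.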
A variant of the newer Reddit rating system (Wilson score confidence interval) would be a good way to attack this without the complexity of machine learning. Effectively, a statistical sampling over time that builds confidence that a particular driver's rating is accurate. This would work well with their 40 initial trips policy and you could then just cull those who you have strong confidence are poor drivers. Sigma bounds over the whole population of drivers would also tell you what a "reasonable" rating is which would float over time.
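For reference, a minimal sketch of the lower bound of the Wilson score interval. It works on yes/no outcomes, so it assumes star ratings have been collapsed to "positive or not" first (for example, 5 stars counts as positive, which is my assumption, not anything from Uber or Reddit). The default z of 1.96 corresponds to roughly 95% confidence.

```python
import math

def wilson_lower_bound(positive, total, z=1.96):
    """Lower bound of the Wilson score interval for a binomial proportion."""
    if total == 0:
        return 0.0
    p = positive / total
    z2 = z * z
    denom = 1 + z2 / total
    centre = p + z2 / (2 * total)
    margin = z * math.sqrt((p * (1 - p) + z2 / (4 * total)) / total)
    return (centre - margin) / denom

# Same 95% positive rate, different amounts of evidence.
after_40_trips = wilson_lower_bound(38, 40)
after_400_trips = wilson_lower_bound(380, 400)
```

The bound sits below the raw positive rate and tightens toward it as trips accumulate, so a driver with 40 trips is judged more cautiously than one with 400 at the same rate.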
That would work, though it's less than ideal, and machine learning is pretty easy. The main advantage is you can get very accurate predictions on what the next rating will be, and you can easily add in more arbitrary data to make the predictions more accurate. This is the goal of ratings after all, to estimate the probability that a service will be good or bad.
Or you can just as easily phrase this the other way around. If you were completely satisfied by your service, why should you give less than the highest rating available? Are you just subtracting stars because the driver didn't do things which you wouldn't expect the driver to do?
It seems pretty stupid to have a multi-point rating scale that doesn't distinguish between the driver that gave you polite, timely and safe service and the driver that did all that but additionally gave you fantastic advice about the city and insisted on carrying your luggage up a flight of steep steps at the end.
It's even worse to have a points scale where not awarding the first driver the same as the second moves him a few percentage points closer to losing his livelihood.
I disagree with your first point. I still think it's silly to subtract points from the maximum simply because a driver didn't do some nice things which he or she shouldn't be expected to do, like carry your luggage or give you a million dollars.
You can partially infer something from the lack of ratings. People usually only do them to give good or bad ones, so you can balance it out a bit with non-voters as "average" scores.
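A quick sketch of that balancing, assuming you know how many rides a driver completed and which of them were rated. The function name and the choice of 3.0 as the "average" score for an unrated ride are assumptions.

```python
def score_with_nonvoters(stars, rides, neutral=3.0):
    """Average rating where each unrated ride counts as a neutral score.

    stars: list of ratings actually submitted
    rides: total rides completed, rated or not
    neutral: score imputed for each unrated ride (assumed value)
    """
    missing = rides - len(stars)
    if rides <= 0 or missing < 0:
        raise ValueError("rides must be positive and at least len(stars)")
    return (sum(stars) + neutral * missing) / rides

# Two angry 1-star ratings out of ten rides, the other eight unrated.
raw_average = sum([1, 1]) / 2
balanced = score_with_nonvoters([1, 1], rides=10)
```

Here the raw average is 1.0, while the balanced score is 2.6, since the eight silent passengers are treated as having had an unremarkable ride.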
Not voting is nearly impossible on Uber though. As soon as you open up the app it prompts you. Often I've forgotten the ride and just press 5 stars unless something was exceptionally bad.