Andreas Weigend Data Mining and E-Business: The Social Data Revolution
STATS 252, Stanford University, Spring 2009
Class time: Monday 2:15 - 5:05 pm
Class location: Gates B01

What is statistics about? It is the science that deals with noise and with generalization, which is trivial if there is no noise. If there is no noise, all you have to do is a nearest-neighbor lookup followed by linear interpolation, and you have what you want. What we really want is some way to model the noise in a noisy environment so that we have an expected value, etc.
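To make the noise-free case concrete, here is a minimal Python sketch of a neighbor lookup followed by linear interpolation. The function name `interpolate` and the sample data are illustrative assumptions, not something from the lecture.

```python
from bisect import bisect_left

def interpolate(xs, ys, x):
    """Linearly interpolate between the two known points that bracket x.

    Assumes xs is sorted ascending and the samples are noise-free
    (hypothetical helper for illustration).
    """
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_left(xs, x)            # index of the first known x >= query
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 2.0, 4.0, 6.0]             # exact samples of y = 2x, no noise
estimate = interpolate(xs, ys, 1.5)
```

Once the samples are noisy, this lookup simply reproduces the noise, which is why a model of the noise is needed.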

Data

We'll learn about the power of data and meta data (data about data--for example, in communications, the importance attached to e-mails from sender to recipient constitutes the meta data).

Dead data--you can’t influence anything, for example, in the Netflix contest.

Live data--do experiments to collect information about other people.

We then have to design incentives so that people create and share data with us. We can do this now because communication is very cheap, in terms of storing data, indexing it, and distributing it within the network. We also have to bear in mind the hidden costs involved in collecting data, like the cost of annoying people. In the case of our Facebook pages, using "interrupt and repeat" marketing methods may cause users to unsubscribe. What, then, is the cost of an unsubscription? In the case of Amazon, what is the cost of people calling the call center? If customer service does its job well, it may be a net benefit when people feel that they've been taken care of.

Instrumenting the world

We all carry detection devices--mobile phones, phones with accelerometers. How can we use them to get data?

Sources: unobservable, surrogate (online, mobile, offline), geolocation. The sky's the limit, compared to 20 years ago.

Multiple scales: timescales (seconds vs. years), granularity (fine vs. coarse)

Financial markets are where scales can be particularly important. Consider what one person from GMO (investment management firm) said: By now, the determining factor between hedge funds is literally how far the computer physically sits from where the execution happens. A difference of a few meters determines who moves the market first.

Metrics

If you globally substitute "metrics" with variables, you are on the right track; if you globally substitute "metric" with "return" or Key Performance Indicator (KPI), you are not. Look at distribution before looking at the mean--keep in mind the flaw of averages.

Don’t just report some number but instead, look at how things change, the slope, the outliers.

Real time (event driven) data--taking data as they come up right now, for example, a new post increases number of posts by one.

Batch mode data--done once a day, which makes it harder to figure out when things happen and what went wrong.

Interactive aspect data--you can change things as you do data analysis.

Recall from two classes ago, the contrast between Art (care about the product, don’t care how you got to it) and Tools (focus on process, final product doesn't matter)

Lessons from previous homeworks: Start earlier. Spend more time looking at successful pages. Solicit feedback. Think about what people do after looking at your page.

Experiments

Do A/B tests, test early and often. Always collect data from the very beginning--instrument right at beginning and see what works and doesn’t work! Build lo-fi prototypes, stick to the simple stuff and don’t just work on the one big thing.

Applications

4 generations of recommendation systems

Psycho, demographics

Purchases, transaction-based

Social, identity-based, eg. Facebook

Situational recommendations

Relevance

How do we avoid information overload and determine the order in which items appear on a webpage? Facebook and Twitter list the most recent item first. But is this the best way? Perhaps people don’t want to give up control to a computer which tells them to read this over that. On the other hand, we don't see Amazon recommending the most recent item to arrive at the warehouse, or Google ranking results by recent activity.

What metrics can we use to determine relevance?

How often you click

How far down you scroll before clicking

What you skip

How often you come back

Re-tweeting, or sharing

"Like" feature in Facebook

Tip of the Day

On the Social Data Revolution Facebook page, if you want people to be able to comment on your links, click on the box, and click on Add Link. Do not enter url or html in the box.

Discovery vs. Search

We'll talk about this in the coming weeks.

Industry

Large/Established: We'll talk about Amazon, Google (see Bo Cowgill section)

Startups: Submit suggestions for speakers to Enrique.

Tools

We have to critically assess social media tools.

Facebook

Strength and weakness: relationships have to be mutually confirmed.

Data is binary: they are either there or not there.

Much richer data compared to Twitter.

Twitter

You are free to follow or not follow someone

Open: what is known is known to everybody.

You can mine data by understanding what questions are being asked.

Changes in user behaviour: How do you make changes/decisions based on data collected?

Vision & the future

(to be covered in future)

To find leverage, create data wherever your true passion is, eg. geolocation.

If you had all the data in the world, what would you do with it?

Next week: the Facebook guy who runs all the experiments and figures out which features work.

Part 1: Decision Analysis

References

Ronald A. Howard, Decision Analysis Manuscript (Foundations of Decision Analysis)

Ronald A. Howard, Readings in Decision Analysis

Peter McNamee and John Celona, Decision Analysis for the Professional

The first two references are the course readers for Professor Howard's DA courses. The third reference is by two consultants from SmartOrg (www.smartorg.com), a firm founded by Jim Matheson, the other co-founder of Decision Analysis. The book is available under the Creative Commons Attribution No-Derivatives Non-Commercial license. Here is the pdf:

A decision is an allocation of resources that is somewhat irrevocable.

There is a cost associated with any decision: choosing this alternative means cutting away other alternatives that could have been chosen.


Robert Frost on his own poetry:
"One stanza of 'The Road Not Taken' was written while I was sitting on a sofa in the middle of England: was found three or four years later, and I couldn't bear not to finish it. I wasn't thinking about myself there, but about a friend who had gone off to war, a person who, whichever road he went, would be sorry he didn't go the other. He was hard on himself that way."
Bread Loaf Writers' Conference, 23 Aug. 1953

A good decision is different from a good outcome

We often hear people say things like "That foul turned out to be a bad decision ‘cause they lost the game.", "You shouldn’t have bought this from Amazon ‘cause it was broken on the way.", "I drove drunk but made it home safely, so what’s bad about it?", which represent a common confusion between Decision and Outcome.

A good outcome is desirable, but quality of a decision is determined when you make it, and by the logic behind it!

Characteristics of Distinctions

Distinction

A thought that separates one big group into two or more smaller groups

Clarity

A distinction has to be clear in order to be useful.

We use a clarity test to ensure there is no confusion in the communication of information.

Examples: the biggest online social network / not
a marketing success / failure
an active user of our page / not

Mutually Exclusive

At most one can occur

All elemental possibilities should be mutually exclusive

Collectively Exhaustive

At least one must occur

The collection of elemental possibilities is collectively exhaustive

Starting point - Decision Basis

Preferences (What we want)

More money, pleasure, attention, connections, power, rights...

Clearly money is not the only thing people are seeking in life.

Example: Professor Howard once offered a deal that goes to the highest bidder. The winning bidder calls "heads" or "tails" on the toss of a thumbtack. If he calls right he gets $200; if he calls wrong he gets nothing (and still pays his bid amount). What do you think the highest bid will be?
To everyone's surprise, the highest bid was $220! Professor Howard asked the student why he would bid such a high amount knowing that he could not make money out of the deal. "It's because I bid $220 that I could stand in front of the classroom to answer this question!" answered the student. "If I bid $180 there might be someone else bidding $190, and if I bid $190 there might be someone else bidding $199, so I would never win such attention."

Alternatives (What we can do)

Examples: To accept / reject / ignore a friend request
To become / not become a fan of a page

Information (What we know)

Models

Probability assignments

There is no "true" or "right" probability. Probability assignments should be based on the background information of the decision maker.

Risk Attitude

People react differently due to many factors:

Their taste for risk

Their wealth state

Example of a deal: we flip an "unfair" coin that we believe has a 60% chance of landing heads and a 40% chance of landing tails. You get $1 if it lands heads and lose $1 if it lands tails. How many of you are willing to accept the deal? - About half the class
What if the amount is $100? - About 10
$10,000? - Only 3

Conclusion: The value of a deal is not necessarily the expectation of the dollar amounts placed on the possible outcomes!

If you want to go further into risk attitude you can assess your risk tolerance with an exponential equation. In order to do so use the risk odds question.

Trees in Decision Analysis

Typically in decision analysis, we use a structure called "trees" to organize our thoughts and diagram series of possible outcomes. If several "branches" of the tree meet at a square, that indicates that those branches give different options for a decision which the decision maker is facing. If the branches meet at a circle, then the branches indicate possibilities for an uncertain event.

In our decision analysis class, we studied this tree, which models Hamlet's famous soliloquy: (read the text here, along with some common interpretations)

While trees typically are used for more quantitative situations, it is important to note that any decision situation -- even one such as this without apparent quantifiable elements -- can be modeled in a decision tree.

Five Rules

We must follow these five rules in order to use the decision analysis framework.

While some of these rules may seem trivial, they are extremely important, and obeying them can become a non-trivial matter in larger scale decision analysis problems.

Probability rule

Decisions can be modeled in terms of possibilities (the potential events that may occur) and probabilities (the chance of those distinct events occurring).

While it may seem very easy to assign a probability to the chance that an event will occur, it is often very difficult to zero in on a highly accurate probability -- doing so requires us to consider many factors, most of which are not obvious at first glance.

Order rule

The decision maker must be able to order possibilities in descending order of preference. Ties are allowed in the case where the decision maker is indifferent between two prospects.

This rule forces the decision maker to have a deep understanding of their preferences, which is essential in order for the decision analysis process to produce useful results.

Equivalence rule

If a decision maker is faced with three non-equal possibilities such that A>B>C, there exists a probability p such that the decision maker is indifferent between receiving B for sure and a deal with a p chance of receiving A and a 1-p chance of receiving C.

Suppose you are given three options: living your perfect life (labeled A), living your life as it is now (B), or an instant, painless death (C). Following the order rule, let's say you rank these prospects by stating that you prefer A to B, and prefer B to C. The equivalence rule states that there exists a probability p such that you are indifferent between living your current life for sure and participating in a deal in which you have a p chance of living your perfect life and a 1-p chance of an instant, painless death.

As with many situations in decision analysis, the difficulty arises in determining p.

Substitution rule

If the decision maker has assigned the same value to several deals, those deals may be substituted for one another.

Assuming a risk-neutral approach, we can find the value of a deal by multiplying the value of each prospect by its probability and summing over the prospects. In this case, the value of the deal is -$34 ( = 0.3(100) + 0.6(60) + 0.1(-1000) ). Once we know the value of this deal, this rule allows us to replace any deal which is also valued at -$34 with this deal. Additionally, if we had another deal which we valued at $100, we could replace branch A with that deal.

As you can see, this rule allows us to simplify trees -- for example, we can translate a large tree with many possible prospects into a tree which only contains a few of the original prospects (assuming the numbers work out correctly).
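The risk-neutral arithmetic behind the substitution rule can be checked with a short Python sketch. The helper name `expected_value` is a hypothetical choice for illustration; the probabilities and dollar amounts are the ones quoted in the expression 0.3(100) + 0.6(60) + 0.1(-1000).

```python
def expected_value(prospects):
    """Risk-neutral value of a deal given (probability, dollar value) pairs.

    Hypothetical helper for illustration; not part of the course materials.
    """
    total_p = sum(p for p, _ in prospects)
    assert abs(total_p - 1.0) < 1e-9, "probabilities must sum to 1"
    return sum(p * v for p, v in prospects)

# The three prospects quoted in the text.
deal = [(0.3, 100), (0.6, 60), (0.1, -1000)]
value = expected_value(deal)   # 30 + 36 - 100 = -34
```

Any other deal the decision maker also values at this amount could stand in for this one in a larger tree.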

Choice rule

Given two deals with the same two prospects, the decision maker must choose the deal with the higher probability of the prospect that they like better.

This rule is probably the most intuitive, and also is the only rule that specifies which action you as the decision maker must take.

Simple Example: Should you send a friend request?

What is the benefit of sharing information on a social network? What are the costs?

Decision analysis requires us to assign monetary values to actions and outcomes which we may not typically think of in an economic way, such as sending a friend request on Facebook. Thinking in depth about the costs and benefits associated with such an action allows us, as decision makers, to create better estimates of the value we place on that action and its outcomes.

Clearly, we can place different monetary values on an action depending on the situation. For example, we may place a negative value on not sending a friend request to an individual who has posted photos of us or invites his Facebook friends to events we'd like to attend. However, we could value not sending the friend request at $0 if the recipient was someone we were indifferent about being connected to on Facebook.

How would you rank your value of these outcomes?

Once we have specified the value of each deal in question, this ranking becomes trivial to construct. However, creating a ranking before determining exact values often helps us to come up with and to evaluate those values.

How would you assign probabilities to these possibilities?

Again, thinking in depth about the situation allows us to reach better probability estimates for each possible outcome.

As with determining values for outcomes, assigning probabilities should be situation-dependent: in this example, we could increase our estimate of the probability that the recipient will accept our friend request if we already knew that person very well outside of Facebook.

Tree Flipping

Why trees? At their heart, they make Bayes' Theorem easy to remember. What is Bayes' Theorem? A quick reminder:

Bayes' Theorem allows us to easily find conditional probabilities using the following formula:

$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$$

Intuitively, the formula allows us to update our beliefs about outcome A once we have observed whether or not outcome B occurred. Here's a really cool example that appeared in an article in the Economist: (link via this page)

The essence of the Bayesian approach is to provide a mathematical rule explaining how you should change your existing beliefs in the light of new evidence. In other words, it allows scientists to combine new data with their existing knowledge or expertise. The canonical example is to imagine that a precocious newborn observes his first sunset, and wonders whether the sun will rise again or not. He assigns equal prior probabilities to both possible outcomes, and represents this by placing one white and one black marble into a bag. The following day, when the sun rises, the child places another white marble in the bag. The probability that a marble plucked randomly from the bag will be white (ie, the child's degree of belief in future sunrises) has thus gone from a half to two-thirds. After sunrise the next day, the child adds another white marble, and the probability (and thus the degree of belief) goes from two-thirds to three-quarters. And so on. Gradually, the initial belief that the sun is just as likely as not to rise each morning is modified to become a near-certainty that the sun will always rise.
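The marble-bag updating in the quoted example can be written out as a small Python sketch. The function name `belief_after` and its parameters are illustrative assumptions; the starting bag of one white and one black marble comes from the quote.

```python
from fractions import Fraction

def belief_after(sunrises, white=1, black=1):
    """Degree of belief in another sunrise after observing `sunrises` sunrises.

    Starts from one white and one black marble (equal priors) and adds
    one white marble per observed sunrise. Hypothetical helper.
    """
    white += sunrises
    return Fraction(white, white + black)

# Beliefs after 0, 1, 2 and 3 observed sunrises.
beliefs = [belief_after(n) for n in range(4)]
```

The first three values reproduce the one-half, two-thirds, three-quarters sequence in the quote, and the belief approaches 1 as more sunrises are observed.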

Let's start with a simple example:

In this tree, each of the values on the right (a joint probability) is the product of the two corresponding values on the left, ie P(AB) = 0.6 * 0.8 = 0.48. Note that the values for B and B' are dependent on whether or not A occurred. This tree gives:

| Probability | Value |
| --- | --- |
| P(A) | 0.6 |
| P(A') | 0.4 |
| P(B\|A) | 0.8 |
| P(B'\|A) | 0.2 |
| P(B\|A') | 0.2 |
| P(B'\|A') | 0.8 |

Looking at this table, you probably have noticed that we could apply Bayes' Theorem and obtain P(A|B) if we had values for P(B) and P(B'). This is where tree flipping comes in handy.

Here is the above tree after flipping. Note that the top and bottom joint probabilities remain in place, but the middle two swap places in order for the tree to flow correctly. All joint probabilities retain their values from the original tree.

First, we calculate the values for P(B) and P(B') by adding the joint probabilities which stem from the branch corresponding to P(B) or P(B'). For example, P(B) = P(BA) + P(BA') = 0.48 + 0.08 = 0.56.

Remember, by the rules of conditional probability, that P(B|A) = P(AB) / P(A). This means that we can rewrite Bayes' Theorem to say P(A|B) = P(AB) / P(B), by canceling out the P(A) terms. We then use this version of Bayes' Theorem to obtain P(A|B), P(A'|B), P(A|B'), and P(A'|B'). For example, P(A'|B) = P(A'B) / P(B) = 0.08/0.56 = 0.142857, and so on.

A note about probability terminology: in the original tree, P(A) is the prior and P(B|A) is the likelihood. In the flipped tree, P(A|B) is the posterior and P(B) is the pre-posterior.
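As a check on the tree-flipping steps, here is a minimal Python sketch that starts from the prior and likelihood values in the example and produces the joint, pre-posterior, and posterior probabilities. The function name `flip_tree` and the dictionary layout are assumptions made for illustration.

```python
# Original tree: prior on A, then B conditional on A (values from the example).
prior = {"A": 0.6, "A'": 0.4}
likelihood = {
    "A":  {"B": 0.8, "B'": 0.2},
    "A'": {"B": 0.2, "B'": 0.8},
}

def flip_tree(prior, likelihood):
    """Flip a two-level probability tree (hypothetical helper).

    Returns (joint, pre_posterior, posterior), where
    joint[(a, b)] = P(a, b), pre_posterior[b] = P(b),
    and posterior[b][a] = P(a | b).
    """
    # Joint probabilities are unchanged by flipping.
    joint = {(a, b): prior[a] * likelihood[a][b]
             for a in prior for b in likelihood[a]}
    # Sum the joints that share the same B outcome.
    pre_posterior = {}
    for (a, b), p in joint.items():
        pre_posterior[b] = pre_posterior.get(b, 0.0) + p
    # Bayes' Theorem in the form P(a | b) = P(a, b) / P(b).
    posterior = {b: {a: joint[(a, b)] / pre_posterior[b] for a in prior}
                 for b in pre_posterior}
    return joint, pre_posterior, posterior

joint, pre_posterior, posterior = flip_tree(prior, likelihood)
```

Here `pre_posterior["B"]` corresponds to P(B) and `posterior["B"]["A'"]` to P(A'|B) in the worked numbers above.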

Decision Tree Tools

If you're interested in using decision trees to model your decision process, check out these two free Excel plug-ins -- they'll perform all the calculations and keep your tree looking neat as you add nodes for decisions and events.

One of the past TAs from the decision analysis course made an open-source plug-in which can be found here. Note that it works best with Office 2003 for Windows, and works pretty well with Office 2007 for Windows. It does not work with Mac versions of Office.

Treeplan.com has a great decision tree plug-in called TreePlan -- you can download a free 30-day trial here. This plug-in works on Macs and PCs with versions of Office up through 2007.

Cost of Information

Decision Analysis gives us a framework to measure the value of deals that we face as well as to measure the value of additional information about uncertainties in our models.
Assume that you are running an email campaign for your company and the following is true:
· Emails will be sent to 100,000 people
· Response rate will be either 1% or 2%
· There is a 60% chance that the response rate will be 1% and a 40% chance that it will be 2%
· The payoff for a positive response is $30
· The cost of the email campaign to 100,000 customers is $0.40 per email

Calculate value without a test

This model is simple: it has one decision (run the campaign or not) and one uncertainty (the response rate). Using a simple model allows us to see the framework on which we can build more complex models.

This tree calculates the value of this deal for a risk neutral person:

Since the "value" of this deal is positive, $2,000, you would go ahead with the campaign.
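The $2,000 figure can be reproduced with a short Python sketch under the stated campaign assumptions. The constant and function names are illustrative, and the sketch assumes that not running the campaign is worth $0.

```python
N_EMAILS = 100_000
PAYOFF_PER_RESPONSE = 30.0
COST_PER_EMAIL = 0.40
# Response rate -> probability, as given in the campaign assumptions.
RATE_PROBS = {0.01: 0.6, 0.02: 0.4}

def campaign_profit(rate):
    """Profit from running the campaign at a given response rate."""
    return N_EMAILS * rate * PAYOFF_PER_RESPONSE - N_EMAILS * COST_PER_EMAIL

# Risk-neutral value of running the campaign without any test.
value_run = sum(p * campaign_profit(rate) for rate, p in RATE_PROBS.items())
# Assumption: the alternative (no campaign) is worth $0.
value_no_test = max(value_run, 0.0)
```

The two branches are worth -$10,000 (1% response) and $20,000 (2% response), which average to the positive value that justifies running the campaign.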

Perfect test

What is the value of knowing the outcome of the uncertainty before you make the decision? Assume that there is a clairvoyant that you can ask. The clairvoyant knows everything about all states past, present, and future. He does not know about probabilities however, so it is up to you to assign those according to your knowledge.
Clairvoyance: the test predicts the response rate perfectly, i.e. its predictions are correct 100% of the time.
· How much should you be willing to pay for perfect information?

The following tree models the clairvoyant case. The clairvoyant tells you what the response rate will be before you make your decision.
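A minimal Python sketch of the clairvoyant case follows, using the campaign numbers given earlier and assuming a risk-neutral decision maker for whom skipping the campaign is worth $0. The variable names are hypothetical.

```python
# Profit of running the campaign under each response rate,
# and the probability of each rate (from the campaign assumptions).
PROFIT = {0.01: -10_000.0, 0.02: 20_000.0}
PRIOR = {0.01: 0.6, 0.02: 0.4}

# Without information: commit to one decision before the rate is known.
value_without = max(sum(PRIOR[r] * PROFIT[r] for r in PRIOR), 0.0)

# With clairvoyance: learn the rate first, then run only if profitable.
value_with = sum(PRIOR[r] * max(PROFIT[r], 0.0) for r in PRIOR)

# The most a risk-neutral decision maker should pay for perfect information.
value_of_clairvoyance = value_with - value_without
```

Under these assumptions the clairvoyant lets you avoid the losing 1% branch, so the deal with free clairvoyance is worth $8,000 and the information itself is worth $6,000. No real test can be worth more than this.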

Imperfect test

In the real world, tests are less than perfect, and in this case we get a test that is 90% accurate. To put this into our decision tree, we need to flip the tree, which incorporates Bayes' rule.
We have to do this because the accuracy of the test is assessed by rating how well the test performs given an outcome. That is, if the response rate will be 2%, then the test says that it will be 2% 9 times out of 10. However, the order we need in the decision tree is test first, then outcome.
These two diagrams show the process:

Assess the likelihood that the test gives the correct indication

Flip the tree to put knowledge about the test before knowledge about the uncertainty

Put the probabilities into the decision tree

The value of a test with 90% accuracy is $6,600 - $2,000 = $4,600
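The three steps above can be sketched in Python as follows. The sketch assumes the test is symmetric (90% accurate whether the true rate is 1% or 2%), that the decision maker is risk neutral, and that not running the campaign is worth $0; the function name `value_with_test` is hypothetical.

```python
PROFIT = {0.01: -10_000.0, 0.02: 20_000.0}   # profit of running, per rate
PRIOR = {0.01: 0.6, 0.02: 0.4}               # P(rate)

def value_with_test(accuracy):
    """Value of the deal when a free test of the given accuracy is seen first."""
    total = 0.0
    for indication in PRIOR:                  # what the test says
        # Joint P(indication, rate): the test is right with prob `accuracy`.
        joint = {rate: PRIOR[rate] * (accuracy if rate == indication
                                      else 1.0 - accuracy)
                 for rate in PRIOR}
        p_indication = sum(joint.values())    # pre-posterior
        # Posterior-weighted value of running, given this indication.
        ev_run = sum(joint[r] / p_indication * PROFIT[r] for r in joint)
        # Run only if it beats the $0 alternative.
        total += p_indication * max(ev_run, 0.0)
    return total

value_no_test = max(sum(PRIOR[r] * PROFIT[r] for r in PRIOR), 0.0)
value_of_test = value_with_test(0.9) - value_no_test
```

Calling `value_with_test(0.5)` returns the no-test value, so a coin-flip test is worth nothing, while `value_with_test(1.0)` recovers the clairvoyant case.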

Testing

In order for a test to have value it must be:

Relevant: the probability of the test indication given that the event occurs must differ from the probability of the same indication given that the event does not occur, i.e. P(test|event) != P(test|not event)

Material: if the uncertainty is resolved, the decision maker would make a different decision

Economic: the cost of the test must be less than the value of the information that it will give you

Remember that the worst test has 50% accuracy, the same as flipping a coin.

Discussion

Framing is perhaps the most important topic in decision analysis. It addresses the questions

What is given?

What is to be decided now?

What can be decided later?

The importance of framing is that it decides which uncertainties are included in your model and which are left out. Sensitivity analysis can help to decide. Framing also deals with which revenue and cost values to include in your model and which to leave out.

Think about all costs that affect your value.

What is the cost of an unsubscribe?

Does that cost differ if that user is more connected in the social network or less connected?

What about other networks?

More on risk - not covered in class

How do you measure risk attitude mathematically? Here I will show one of the easiest methods, the risk odds question. Ask yourself at what value of r you would accept a deal with a 50-50 chance of receiving r or losing half of r. For instance, if r is 100 the deal is a 50-50 chance of winning $100 or losing $50. Keep raising and lowering r until you reach the point where you are indifferent between taking and rejecting the deal.

Another way to think of it is to suppose you pay r/2 to play a deal with a 50-50 chance of receiving 1.5r or receiving 0.

Once you have the value of r, you can calculate your risk aversion coefficient, gamma = 1/(r*1.04). This is the coefficient that you use to convert the payoffs in your model into risk space with an exponential equation. The most common equation, and the one that the Excel plug-in uses, is: 1-exp(-gamma*x).

Using this equation, convert the outcome values into risk space and do the calculation to the base of the tree in that space. When you get to the root node, convert it back into dollar space in order to evaluate the value of the deal. This will tell you which decision to take, given your assumptions.

The following decision tree calculates the value of the risk-adjusted deal where r = $100,000 in the risk odds question above. Here the value is $985, quite a bit less than the risk-neutral $2,000.
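The risk-adjusted calculation can be sketched in Python as follows, using gamma = 1/(r*1.04) and the exponential equation 1-exp(-gamma*x) from the text. The function name `certain_equivalent` is a hypothetical choice, and the two prospects are the campaign profits of -$10,000 (60%) and $20,000 (40%).

```python
import math

def certain_equivalent(prospects, r):
    """Dollar value of a deal for an exponential risk attitude.

    prospects: list of (probability, dollar value) pairs.
    r: the answer to the risk odds question, in dollars.
    Hypothetical helper for illustration.
    """
    gamma = 1.0 / (1.04 * r)                      # risk aversion coefficient
    # Convert each payoff into risk space and take the expectation.
    expected_u = sum(p * (1.0 - math.exp(-gamma * x)) for p, x in prospects)
    # Convert the result at the root back into dollar space.
    return -math.log(1.0 - expected_u) / gamma

campaign = [(0.6, -10_000.0), (0.4, 20_000.0)]
ce = certain_equivalent(campaign, r=100_000)
```

With r = $100,000 the result comes out near the $985 quoted above. As r grows very large the value approaches the risk-neutral $2,000, since gamma tends to zero.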

Take Aways

Decision Analysis provides a consistent logical method for evaluating decisions

It gives you the "value" of a deal that you might acquire or sell

It lets you calculate the value of additional information

It frequently produces results that are not intuitive!!!

It requires a frame, the judgement to discern which variables should be included and excluded from your model

Part 2: Prediction Markets

To provide an objective measure of something "everyone knows"

To create transparency

People try to investigate what is going on

To give attention to the projects that are important

Examples: How many (new) fans on FB page?
Will US average gas prices reach $2.50 by June 30?

Prediction Market definition and introduction:

General information: http://en.wikipedia.org/wiki/Prediction_market
A prediction market can be compared to a stock market for ideas or information. The market rewards good information whether it comes from elites or the masses. Prediction markets have built a track record of besting pundits and pollsters when it comes to predicting everything from political elections to quarterly sales figures.

What is a prediction market?

How can I use a prediction market in my business?
Also, refer to "Wisdom of the crowds" by James Surowiecki:
Under what circumstances is the crowd smarter?

Crowd needs to be diverse, so that people are bringing different pieces of information to the table.
Information elicitation: It is not always obvious where information comes from. Each person holds different information, e.g. the janitor knows who stays at work the longest and who is pulling all-nighters.

Crowd needs to be decentralized, so that no one at the top is dictating the crowd's answer.

Crowd needs a way of summarizing people's opinions into one collective verdict.
Information aggregation: Need mechanism to summarize what crowd really thinks. Eg. Price of the stock. Parallel to machine learning: Combining weak learners, boosting, etc. Prediction markets allow us to aggregate information from a huge variety of sources.

And the people in the crowd need to be independent, so that they pay attention mostly to their own information and do not worry about what everyone around them thinks.
Prevention of social influences.

Setting up the market

Iowa-like model
The University of Iowa set up the first prediction market to predict the outcome of the presidential elections in the U.S. in 1988.

http://inklingmarkets.com/homes/faq
A share is representative of a specific outcome and its probability or value of occurrence as designated by the share price. A purchased share represents the confidence of the purchaser in the specific outcome designated by the share.

Endowment

If bundle pays off $100, give people $100, and then let them bid

Who should participate

Kick out those who know too much? Kick out those who know too little? No: they provide liquidity and get others to participate. Everyone can participate.
Empirical findings:
People at the top aren't doing better than people at the bottom, e.g. the CEO vs. people who are 7 degrees away from the CEO.
Normalized by trade and by person. Each function will have its own biases.
Usually, most people's final holdings remain similar at the end of a quarter, though there are some people who do well and some people who really stink.

Prizes

The way the prizes are designed to encourage participation makes or breaks the market.

One option would be to give money to only the top people. However, in this scenario, there is a problem in that it makes the people with no chance of placing in the top indifferent. So, if you are not the top person you have no motivation to participate.

A linear or proportional pay off scheme could solve this issue. For example, people get one raffle ticket for every dollar they made (linear incentive scheme). But there is a problem in that people that do well may not be rewarded because raffle winners are random.

To encourage active participation, give $1000 out to the most active participants. Can also give out t-shirts to other pretty active participants.

Discussion--Liquidity: How to get people to participate?

Potential users have a bad experience if nobody else is hanging out there. Feedback, Lists
What is the effect of making prices public, e.g., in the internal market at a company (e.g., Best Buy, Google)?
Sharing private information via a prediction market is tricky; people are not comfortable with it. For example, what if there had been a market on what Hotmail's market share would be after the launch of Gmail, a new, unknown product at the time?
Automated market bots shouldn't be discouraged: they also provide liquidity, and they can be wrong just as human traders can.

Part 3: Bo Cowgill (Google) on Prediction Markets

Traditional Methods of Forecasting and Their Shortcomings

The "Smartest guy in the room" method: have your best expert (or best computer program made by an expert) make your forecast.

The problems

But the smartest guy doesn't have all the relevant info. Only the janitor knows who stays late working.

The "smartest committee in the room" method: assemble a team of your best experts and decide by vote

The problems

(1) No proper aggregation of votes.

(2) One person, one vote rule gives you no way to indicate how confident you are (You can't cast 2/3 of a vote).

(3) Physical limits to how many people you can put on your committee.

Characteristics of basic prediction markets

Market Design Considerations

Integrity

Make clear rules about whether or not your outcomes "happened"

Keep "insiders" who can personally affect the outcome out of your market

Make sure people can bet on all possible outcomes

Having an "other" category does the trick

Bots are ok! Bots can be good/bad just like humans

Some of the best "investors" in Google markets are bots

Liquidity

Make sure your market is liquid - do you need an automated market maker?

If Google could change its markets it would consider having an automated market maker for some less active markets

Practicality

Design your market to answer genuinely uncertain questions

For example, give the market 5 plausible forecasts to choose between

If you need to make a yes/no decision, ask your market a yes/no question

The Value of Prediction Markets Beyond Prediction

Collect data about your bettors

Google learned about which departments are more accurate, and how accuracy transfers from person to person

Chart of the Day: Information Sharing at Google

The Future of Prediction Markets

More liquid as more people get into it

More use by companies after current recession improves


## Class 4: Decisions


## Statistics

What is statistics about? It is the science that deals with noise, and generalization, which is is trivial if there is no noise. If there is no noise, all you have to do is do a next-neighbor look up, then linear interpolation and you have what you want. What we really want to do is find some way to mottle the noise in a noisy environment so that we have an expected value, etc.## Data

We'll learn about the power of data and meta data (data about data--for example, in communications, the attached importance of e-mails from sender to recipient constitutes the meta data)We then have to design incentives so that people create and share data with us. We can do this now because communications is very cheap now, in terms of storing data, indexing it, distributing it within the network. We also have to bear in mind the hidden costs involved in collecting data, like the cost of annoying people? In the case of our Facebook pages, using "interrupt and repeat" marketing methods may cause users to unsubscribe. What then, is the costs of unsubscription. In the case of Amazon, what is the cost of people calling the call center? If customer service does its job well, it may be a positive benefit when people feel that they'd been taken care of.

## Instrumenting the world

We all carry detection devices: mobile phones, phones with accelerometers. How can we use them to get data?

Financial markets are where scales can be particularly important. Consider what one person from GMO (an investment management firm) said: by now, the determining factor between hedge funds is literally how far the computer physically sits from where the execution happens. A difference of a few meters determines who moves the market first.

If you globally substitute "metrics" with "variables", you are on the right track; if you globally substitute "metric" with "return" or Key Performance Indicator (KPI), you are not. Look at the distribution before looking at the mean, and keep in mind the flaw of averages.

Metrics

Don’t just report some number; instead, look at how things change: the slope, the outliers.

Do A/B tests; test early and often. Always collect data from the very beginning: instrument right at the start and see what works and what doesn’t! Build lo-fi prototypes, stick to the simple stuff, and don’t just work on the one big thing.

Experiments

Applications

## 4 generations of recommendation systems

## Relevance

How do we avoid information overload and determine the order in which items appear on a webpage? Facebook and Twitter list the most recent item first. But is this the best way? Perhaps people don’t want to give up control to a computer that tells them to read this over that. On the other hand, we don't see Amazon recommending the most recent item to arrive at the warehouse, or Google ranking results by recent activity.

## What metrics can we use to determine relevance?

## Tip of the Day

On the Social Data Revolution Facebook page, if you want people to be able to comment on your links, click on the box, and then click on Add Link. Do not enter a URL or HTML in the box.

## Discovery vs. Search

We'll talk about this in the coming weeks.

## Industry

## Tools

We have to critically assess social media tools.

## Facebook

## Twitter

(to be covered in future)

People and Behaviour

(to be covered in future)

Vision & the future

## Part 1: Decision Analysis

## References

- Ronald A. Howard, Decision Analysis Manuscript (Foundations of Decision Analysis)
- Ronald A. Howard, Readings in Decision Analysis
- Peter McNamee and John Celona, Decision Analysis for the Professional

The first two references are the course readers for Professor Howard's DA courses. The third reference is by two consultants from SmartOrg (www.smartorg.com), a firm founded by Jim Matheson, the other co-founder of Decision Analysis. The book is available under a Creative Commons Attribution, No-Derivatives, Non-Commercial license. Here is the PDF: Decision+Analysis+for+the+Professional.pdf


## Let's make GOOD Decisions

## What is a Decision?

A decision is an allocation of resources that is somewhat irrevocable.

There is a cost associated with any decision: choosing this alternative means cutting away other alternatives that could have been chosen.

Robert Frost on his own poetry:

"One stanza of 'The Road Not Taken' was written while I was sitting on a sofa in the middle of England: was found three or four years later, and I couldn't bear not to finish it. I wasn't thinking about myself there, but about a friend who had gone off to war, a person who, whichever road he went, would be sorry he didn't go the other. He was hard on himself that way."

Bread Loaf Writers' Conference, 23 Aug. 1953

## A good decision is different from a good outcome

We often hear people say things like "That foul turned out to be a bad decision ‘cause they lost the game.", "You shouldn’t have bought this from Amazon ‘cause it was broken on the way.", "I drove drunk but made it home safely, so what’s bad about it?", which represent a common confusion between Decision and Outcome.

A good outcome is desirable, but the quality of a decision is determined when you make it, and by the logic behind it!

Characteristics of Distinctions

## Distinction

## Clarity

- A distinction has to be clear in order to be useful.
- We use a clarity test to ensure there is no confusion in the communication of information.

Examples:

- the biggest online social network / not
- a marketing success / failure
- an active user of our page / not

## Mutually Exclusive

## Collectively Exhaustive

## Starting point - Decision Basis

## Preferences (What we want)

- More money, pleasure, attention, connections, power, rights...
- Clearly money is not the only thing people are seeking in life.

Example: Professor Howard once offered a deal to the highest bidder. The winner of the bidding calls "heads" or "tails" on the toss of a thumbtack. If he calls it right he gets $200; if he calls it wrong he gets nothing (and still pays his bid amount). What do you think the highest bid will be?

To everyone's surprise, the highest bid was $220! Professor Howard asked the student why he would bid such a high amount knowing that he could not make money out of the deal. "It's because I bid $220 that I could stand in front of the classroom to answer this question!" answered the student. "If I bid $180 there might be someone else bidding $190, and if I bid $190 there might be someone else bidding $199, so that I would never win such attention."

## Alternatives (What we can do)

Examples:

- To accept / reject / ignore a friend request
- To become / not become a fan of a page

## Information (What we know)

Risk Attitude

- Their taste for risk
- Their wealth state
Example of deal: we flip an "unfair" coin, and we believe there is a 60% chance of heads and a 40% chance of tails. You get $1 if it is heads, or lose $1 if it is tails. How many of you are willing to accept the deal?

- About half the class

What if the amount is $100?

- About 10

$10,000?

- Only 3

Conclusion: The value of a deal is not necessarily the expectation of the dollar amounts placed on the possible outcomes!

If you want to go further into risk attitude, you can assess your risk tolerance with an exponential equation. To do so, use the risk odds question.

## Trees in Decision Analysis

Typically in decision analysis, we use a structure called a "tree" to organize our thoughts and diagram sequences of possible outcomes. If several "branches" of the tree meet at a square, those branches are the alternatives for a decision that the decision maker is facing. If the branches meet at a circle, the branches are the possibilities for an uncertain event.

In our decision analysis class, we studied this tree, which models Hamlet's famous soliloquy: (read the text here, along with some common interpretations)

While trees typically are used for more quantitative situations, it is important to note that any decision situation -- even one such as this without apparent quantifiable elements -- can be modeled in a decision tree.

## Five Rules

We must follow these five rules in order to use the decision analysis framework.

## Probability rule

Decisions can be modeled in terms of possibilities (the potential events that may occur) and probabilities (the chance of each of those distinct events occurring).

## Order rule

The decision maker must be able to order possibilities in descending order of preference. Ties are allowed where the decision maker is indifferent between two prospects.

## Equivalence rule

If a decision maker is faced with three non-equal possibilities such that A>B>C, there exists a probability p such that the decision maker is indifferent between receiving B for sure and a deal with a p chance of receiving A and a 1-p chance of receiving C.

## Substitution rule

If the decision maker has assigned the same value to several deals, those deals may be substituted for one another.

## Choice rule

Given two deals with the same two prospects, the decision maker must choose the deal with the higher probability of the prospect that he likes better.

## Simple Example: Should you send a friend request?

## What is the benefit of sharing information on a social network? What are the costs?

## How would you rank your value of these outcomes?

## How would you assign probabilities to these possibilities?

## Tree Flipping

Why trees? At their heart, they make Bayes' Theorem easy to remember. What is Bayes' Theorem? A quick reminder: P(A|B) = P(B|A) * P(A) / P(B).

Let's start with a simple example:

In this tree, each of the values on the right (a joint probability) is the product of the two corresponding values on the left, i.e. P(AB) = 0.6 * 0.8 = 0.48. Note that the values for B and B' depend on whether or not A occurred. This tree gives:

Here is the above tree after flipping. Note that the top and bottom joint probabilities remain in place, but the middle two swap places in order for the tree to flow correctly. All joint probabilities retain their values from the original tree.

First, we calculate the values for P(B) and P(B') by adding the joint probabilities which stem from the branch corresponding to P(B) or P(B'). For example, P(B) = P(BA) + P(BA') = 0.48 + 0.08 = 0.56.

Remember, by the rules of conditional probability, that P(B|A) = P(AB) / P(A). This means that we can rewrite Bayes' Theorem to say P(A|B) = P(AB) / P(B), by canceling out the P(A) terms. We then use this version of Bayes' Theorem to obtain P(A|B), P(A'|B), P(A|B'), and P(A'|B'). For example, P(A'|B) = P(A'B) / P(B) = 0.08/0.56 = 0.142857, and so on.

A note about probability terminology: in the original tree, A is the prior and B is the likelihood. In the flipped tree, A is the posterior and B is the pre-posterior.
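The flip described above can be reproduced in a few lines. This is a minimal sketch, not the course's own tool: P(A) = 0.6 and P(B|A) = 0.8 come from the text, while P(B|A') = 0.2 is inferred from the stated joint probability P(BA') = 0.08 and P(A') = 0.4.

```python
# Tree flipping with the numbers from the example above.
# P(A) and P(B|A) are quoted in the text; P(B|A') = 0.2 is inferred from
# the stated joint P(BA') = 0.08 and P(A') = 1 - 0.6 = 0.4.
p_a = {"A": 0.6, "A'": 0.4}
p_b_given_a = {
    "A": {"B": 0.8, "B'": 0.2},
    "A'": {"B": 0.2, "B'": 0.8},
}

# Step 1: joint probabilities (right-hand side of the original tree).
joint = {(a, b): p_a[a] * p_b_given_a[a][b]
         for a in p_a for b in ("B", "B'")}

# Step 2: marginals for B (first level of the flipped tree).
p_b = {b: sum(joint[(a, b)] for a in p_a) for b in ("B", "B'")}

# Step 3: posteriors P(A|B) = P(AB) / P(B) (second level of the flipped tree).
p_a_given_b = {b: {a: joint[(a, b)] / p_b[b] for a in p_a} for b in p_b}

for b in p_b:
    print(f"P({b}) = {p_b[b]:.2f}")
    for a in p_a:
        print(f"  P({a}|{b}) = {p_a_given_b[b][a]:.6f}")
```

The joint probabilities are computed once and reused on both sides of the flip, which mirrors the observation that they keep their values when the tree is redrawn.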

## Decision Tree Tools

If you're interested in using decision trees to model your decision process, check out these two free Excel plug-ins. They'll perform all the calculations and keep your tree looking neat as you add nodes for decisions and events.

Note: they do not work with Mac versions of Office.

## Cost of Information

Decision Analysis gives us a framework to measure the value of the deals that we face, as well as the value of additional information about the uncertainties in our models.

Assume that you are running an email campaign for your company and the following is true:

- Emails will be sent to 100,000 people
- The response rate will be either 1% or 2%
- There is a 60% chance that the response rate will be 1% and a 40% chance that it will be 2%
- The payoff for a positive response is $30
- The cost of the email campaign to 100,000 customers is $0.40 per email

## Calculate value without a test

This model is simple: it has one decision (run the campaign or not) and one uncertainty (the response rate). Using a simple model allows us to see the framework on which we can build more complex models.

This tree calculates the value of this deal for a risk-neutral person:

Since the "value" of this deal is positive, $2,000, you would go ahead with the campaign.
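The arithmetic behind the $2,000 figure can be checked directly from the campaign assumptions. The sketch below uses only the numbers listed above; the variable names are illustrative, and the value of not running the campaign is assumed to be $0.

```python
# Risk-neutral value of the email campaign, using the numbers stated above.
n_emails = 100_000
payoff_per_response = 30.0
cost_per_email = 0.40
scenarios = {0.01: 0.6, 0.02: 0.4}  # response rate -> probability

def net_value(response_rate):
    """Profit from running the campaign at a given response rate."""
    revenue = n_emails * response_rate * payoff_per_response
    cost = n_emails * cost_per_email
    return revenue - cost

# Expected profit of running, weighted by the scenario probabilities.
value_run = sum(p * net_value(rate) for rate, p in scenarios.items())
value_skip = 0.0  # ASSUMPTION: not running the campaign costs and earns nothing
best = max(value_run, value_skip)
print(f"run: {value_run:,.0f}  skip: {value_skip:,.0f}  best: {best:,.0f}")
```

At a 1% response rate the campaign loses $10,000 and at 2% it earns $20,000, so the expectation is 0.6 * (-10,000) + 0.4 * 20,000 = $2,000.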

## Perfect test

What is the value of knowing the outcome of the uncertainty before you make the decision? Assume that there is a clairvoyant that you can ask. The clairvoyant knows everything about all states past, present, and future. He does not know about probabilities, however, so it is up to you to assign those according to your knowledge.

Clairvoyance: the test predicts the response rate perfectly, i.e. the clairvoyant's predictions are correct 100% of the time.

- How much should you be willing to pay for perfect information before you make your decision?

## Imperfect test

In the real world, tests are less than perfect; in this case we get a test that is 90% accurate. To put this into our decision tree, we need to flip a tree in order to apply Bayes' rule.

We have to do this because the accuracy of the test is assessed by how well the test performs given an outcome. That is, if the response rate will be 2%, then the test says that it will be 2% 9 times out of 10. However, the order we need in the decision tree is test first, then outcome.

These two diagrams show the process:

The value of a test with 90% accuracy is $6,600 - $2,000 = $4,600
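The comparison between no test, a perfect test, and the 90% test can be sketched as follows. The profits of -$10,000 and +$20,000 are derived from the campaign numbers (100,000 emails, $30 per response, $0.40 per email); treating the test as equally accurate for both response rates and valuing "do not run" at $0 are assumptions of this sketch.

```python
# Value of information for the email campaign, risk-neutral case.
prior = {"1%": 0.6, "2%": 0.4}            # P(response rate)
net = {"1%": -10_000.0, "2%": 20_000.0}   # profit if we run the campaign
accuracy = 0.9                            # ASSUMPTION: P(test says X | rate is X), same for both rates

def best_value(beliefs):
    """Value of the better alternative (run vs. skip, skip = 0) under given beliefs."""
    run = sum(beliefs[s] * net[s] for s in beliefs)
    return max(run, 0.0)

value_no_test = best_value(prior)

# Perfect test: we learn the rate first, then decide.
value_perfect = sum(prior[s] * max(net[s], 0.0) for s in prior)

# Imperfect test: flip the tree to get P(signal) and P(rate | signal),
# then decide separately after each possible test indication.
value_imperfect = 0.0
for signal in prior:
    joint = {s: prior[s] * (accuracy if s == signal else 1 - accuracy) for s in prior}
    p_signal = sum(joint.values())
    posterior = {s: joint[s] / p_signal for s in prior}
    value_imperfect += p_signal * best_value(posterior)

print(f"no test:      {value_no_test:>8,.0f}")
print(f"perfect test: {value_perfect:>8,.0f}  (information worth {value_perfect - value_no_test:,.0f})")
print(f"90% test:     {value_imperfect:>8,.0f}  (information worth {value_imperfect - value_no_test:,.0f})")
```

Under these assumptions the perfect test is worth $8,000 - $2,000 = $6,000, which is an upper bound on what any test of the response rate can be worth, and the 90% test reproduces the $6,600 - $2,000 = $4,600 figure.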

## Testing

## In order for a test to have value it must be:

- Relevant: the probability of the test indication given that the event occurs must differ from the probability of that indication given that the event does not occur, i.e. P(test|event) != P(test|not event)
- Material: if the uncertainty is resolved, the decision maker would make a different decision
- Economic: the cost of the test must be less than the value of the information that it will give you

Remember that the worst test has 50% accuracy, the same as flipping a coin.

## Discussion

Framing is perhaps the most important topic in decision analysis. It addresses the questions:

- What is given?
- What is to be decided now?
- What can be decided later?

The importance of framing is that it decides which uncertainties are included in your model and which are left out. Sensitivity analysis can help to decide. Framing also deals with which revenue and cost values to include in your model and which to leave out.

Think about all costs that affect your value.

## More on risk - not covered in class

How do you measure risk attitude mathematically? Here I will show one of the easiest methods, the risk odds question. Ask yourself at what value of r you would accept a 50-50 deal to receive r or lose half of r. For instance, if r is 100, the deal is a 50-50 chance of winning $100 or losing $50. Keep raising and lowering r until you reach the point where you are indifferent between taking and rejecting the deal.

Another way to think of it is to suppose you pay r/2 to play a deal with a 50-50 chance of receiving 1.5r or receiving 0, which nets out to the same gain of r or loss of r/2.

Using this equation, convert the outcome values into risk space and do the calculation to the base of the tree in that space. When you get to the root node, convert it back into dollar space in order to evaluate the value of the deal. This will tell you which decision to take, given your assumptions.

The following decision tree calculates the value of the risk-adjusted deal where r = $100,000 in the risk odds question above. Here the value is $985, quite a bit less than the risk-neutral $2,000.
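The risk adjustment can be sketched in code. The equation used in class is not reproduced in these notes, so this sketch assumes the standard exponential utility u(x) = 1 - exp(-x / rho), with the risk tolerance rho chosen so that the risk odds deal (50-50 chance of +r or -r/2) is worth exactly zero. Function and variable names are illustrative.

```python
import math

# ASSUMPTION: exponential utility u(x) = 1 - exp(-x / rho); the notes' own
# equation is not shown, so this form is illustrative.
GOLDEN = (1 + math.sqrt(5)) / 2

def risk_tolerance_from_risk_odds(r):
    """rho that makes a 50-50 deal of +r / -r/2 worth exactly zero.

    Setting 0.5*u(r) + 0.5*u(-r/2) = 0 and substituting y = exp(r / (2*rho))
    gives y**3 - 2*y**2 + 1 = 0, whose relevant root is the golden ratio.
    """
    return r / (2 * math.log(GOLDEN))

def utility(x, rho):
    """Convert a dollar amount into 'risk space'."""
    return 1 - math.exp(-x / rho)

def certain_equivalent(deal, rho):
    """deal: list of (probability, dollar outcome) pairs; returns dollars."""
    expected_u = sum(p * utility(x, rho) for p, x in deal)
    return -rho * math.log(1 - expected_u)  # convert back to dollar space

rho = risk_tolerance_from_risk_odds(100_000)
campaign = [(0.6, -10_000), (0.4, 20_000)]
ce = certain_equivalent(campaign, rho)
print(f"risk tolerance: {rho:,.0f}")
print(f"certain equivalent of the campaign: {ce:,.0f}")
```

Under these assumptions the certain equivalent comes out a little under $1,000, in the same neighborhood as the figure quoted above and well below the risk-neutral $2,000. The exact dollar amount depends on how r is mapped to a risk tolerance and on rounding.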

## Take Aways

## Part 2: Prediction Markets

## Reference

Adam Siegel article; good discussion of design parameters: http://weigend.com/files/teaching/stanford/2009/readings/inklingSiegelJPredMkts2009.pdf

Readings on Prediction Markets from 2008: http://weigend.com/files/teaching/stanford/2008/readings/PredictionMarkets/

Using Prediction Markets to track information flows: http://weigend.com/files/teaching/stanford/2008/readings/PredictionMarkets%2520CowgillWolfersZitzewitz2008.pdf

http://www.predictionmarketjournal.com/

## Why prediction markets?

Examples: How many (new) fans on FB page?

Will US average gas prices reach $2.50 by June 30?

## Prediction Market definition and introduction:

General information: http://en.wikipedia.org/wiki/Prediction_market

A prediction market can be compared to a stock market for ideas or information. The market rewards good information whether it comes from elites or the masses. Prediction markets have built a track record of besting pundits and pollsters when it comes to predicting everything from political elections to quarterly sales figures.

What is a prediction market?

How can I use a prediction market in my business?

Also, refer to "The Wisdom of Crowds" by James Surowiecki: Under what circumstances is the crowd smarter?

Information elicitation: It is not always obvious where information comes from. Each person holds different information, e.g. the janitor knows who stays at work the longest and who is pulling all-nighters.

Information aggregation: We need a mechanism to summarize what the crowd really thinks, e.g. the price of the stock. There is a parallel to machine learning: combining weak learners, boosting, etc. Prediction markets allow us to aggregate information from a huge variety of sources.
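A toy simulation can illustrate why aggregation helps, in the same spirit as combining weak learners. This is only a sketch under strong assumptions: each participant's estimate is the true value plus independent noise, and the aggregate is a plain average rather than a market price. The quantity, noise level, and crowd size are made up.

```python
import random

rng = random.Random(0)   # fixed seed so the run is repeatable
truth = 100.0            # hypothetical quantity being forecast
n_people = 200           # hypothetical crowd size

# ASSUMPTION: each participant sees the truth plus independent Gaussian noise.
estimates = [truth + rng.gauss(0, 20) for _ in range(n_people)]

# Aggregate by simple averaging (a stand-in for the market price).
crowd_estimate = sum(estimates) / n_people
crowd_error = abs(crowd_estimate - truth)
mean_individual_error = sum(abs(e - truth) for e in estimates) / n_people

print(f"typical individual error:   {mean_individual_error:.1f}")
print(f"error of the crowd average: {crowd_error:.1f}")
```

The crowd average can never be further from the truth than the average individual is, and with independent errors it is usually much closer. When errors are correlated (everyone shares the same bias), averaging helps far less, which is one reason the diversity of participants matters.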

## Setting up the market

Iowa-like model

The University of Iowa set up the first prediction market in 1988 to predict the outcome of the U.S. presidential election.

## Definition of market

http://en.wikipedia.org/wiki/Prediction_market

http://inklingmarkets.com/homes/faq

## Definition of shares

http://inklingmarkets.com/homes/faq

A share represents a specific outcome; its price indicates the probability (or value) of that outcome occurring. A purchased share represents the purchaser's confidence in the outcome designated by the share.

## Endowment

If a bundle pays off $100, give people $100, and then let them bid.

## Who should participate

Kick out those who know too much? Kick out those who know too little? No: they provide liquidity and get others to participate. Everyone can participate.

Empirical findings:

People at the top aren't doing better than people at the bottom, e.g. the CEO vs. people who are 7 degrees away from the CEO.

Normalized by trade and by person. Each function will have its own biases.

Usually, most people's final holdings remain similar at the end of a quarter, though there are some people who do well and some people who really stink.

## Prizes

The way the prizes are designed to encourage participation makes or breaks the market.

## Discussion--Liquidity: How to get people to participate?

Potential users have a bad experience if nobody else is hanging out there. Feedback, Lists

What is the effect of making prices public, e.g., in the internal market at a company (e.g., BestBuy, Google)?

Sharing private information via a prediction market is tricky; people are not comfortable with it. For example, what if there had been a market on what Hotmail's market share would be after the launch of Gmail, a new, unknown product at the time?

Automated market bots shouldn't be discouraged: a bot also provides liquidity, because it can be wrong.

## Part 3: Bo Cowgill (Google) on Prediction Markets

## Traditional Methods of Forecasting and Their Shortcomings

The problems

## Characteristics of basic prediction markets

## Market Design Considerations

- Integrity
- Liquidity
- Practicality

## The Value of Prediction Markets Beyond Prediction

Collect data about your bettors

## The Future of Prediction Markets

## Prediction Market References

## Markets at Google

http://www.bocowgill.com/GooglePredictionMarketPaper.pdf

www.bocowgill.com

## Popular Existing Prediction Markets

Iowa Electronic Markets

Iowa Electronic Health Markets

Trendio (Bet on trends from popular people to popular products)

Foresight Exchange (Bet on anything)

ProTrade (Sports)

HSX (Movies)

Smarkets (Start up stage betting markets)

Hedge Street (Bet on Misc. financial outcomes and commodity prices)

InTrade (Bet on anything)

Inkling (Bet on anything)

NewsFutures (Bet on anything)

## Questions for Bo

## What has changed?

There are many more people doing it now.

## What is your prediction of use in 3 years?

There will be a move towards a more liquid kind of trading.

## iPhone

logd for iPhone

google group

## Initial Contributors

jbjacobs@stanford.edu

sylviebryant@gmail.com

cheewei@stanford.edu

jyzheng@stanford.edu

erikac@stanford.edu

elegrand@stanford.edu