You are currently browsing Eilon’s articles.

A Markov Decision Problem (MDP) is a model for sequential decision making, in which the underlying state of nature evolves in a stationary way. An MDP is given by a set of states S, an initial state s(0) in S, a set of available actions A(s) for each state s in S, and, for each state s in S and available actions a in A(s), a payoff r(s,a) and a probability distribution q(. | s,a) on S.

The process starts at the initial state s(0). At every stage n, the current state s(n) is known, the decision maker chooses an action a(n) in A(s(n)), receives the stage payoff r(s(n),a(n)), and a new state s(n+1) is chosen according to q(. | s(n),a(n)) and is told to the decision maker. The decision maker’s goal is to maximize the discounted sum of his stage payoffs:

sum-over-n-from-0-to-infty of λ-to-the-power-n times r(s(n),a(n)).

The value of the MDP, that is, the maximum that the decision maker can obtain, depends on the discount factor λ. Denote by v(λ) the value function of the MDP. Which functions can be obtained as the value function of some MDP with finite sets of states and actions?

From now on I restrict the discussion to MDP’s with finite sets of states and actions. Blackwell (1965) proved that for every discount factor λ the decision maker has a pure stationary optimal strategy. It is easy to see that the payoff that corresponds to a pure stationary optimal strategy is the solution of a set of equations, which are linear in λ, and whose coefficients are determined by the payoff function r and the transition function q. It follows that for every pure stationary strategy σ, the corresponding payoff function g(λ ; σ,s) is a rational function of λ. Since there are finitely many pure stationary strategies, we deduce that the value function is the maximum of finitely many rational functions.

Can we obtain any maximum of rational functions as the value function of some MDP? The answer is negative. For example, since the set of states and actions are finite, the payoff function r is bounded, say, by M. In particular, the payoff function of any strategy is bounded by M/(1-λ). In particular, any rational function whose denominator has a root inside the unit ball of the complex plane, or that has a root on the unit ball of the complex plane that has multiplicity larger than 1, cannot be the value function of an MDP.

Is that the only restriction that we have? The answer is still negative. It is not difficult to see that the roots of the denominator of the payoff function of a pure stationary strategy are the inverse of the eigenvalues of the transition matrix, which by a known result in matrix theory must be unit roots, that is, for any root ω of the denominator (which is a complex number) there is an integer k such that ω-to-the-power-k is equal to 1. Thus, a rational function whose denominator has a root that lies on the unit ball of the complex plane and is not a unit root cannot be the value function of an MDP.

Is that all? Yes. Let F be the set of all rational functions f : [0,1) → **R **that satisfy the following property: any root of the denominator either (a) lies outside the unit ball of the complex plane, or (b) lies on the unit ball of the complex plane, has multiplicity 1, and is a unit root. Let V be the set of all functions that are the maximum of finitely many functions in F. A function v is the value function of some MDP if and only if it is in V.

In this post I outlined one direction of the proof. Anyone who is interested in reading the construction that proves the other direction is referred to this paper.

In a previous post I wrote on my experience as a consultant to participants in auctions. I was interested to hear the other side of the coin: how do the ones who set the rule of the auction perceive the competitive situation they are in charge of. To answer this question I met Dorit Levy Tyller, a well known Israeli advocate who has decades of auctions in her professional past.

According to Ms. Levy Tyller, the issue that bothers her most is coordination among the bidders. In a small country like Israel, in which everybody knows everybody else and have friends who know the rest, participants do their best to talk, exchange information, and dissuade others from increasing their bid. When the bidders are all present at the same hall, the auction turns into an oriental bazaar with a lot of psychological pressure on participants.

To overcome this difficulty, Ms. Levy Tyller assigns each participant to a different room and asks the participants to arrive to their designated rooms at different times. During the auction she moves with her team from one room to the next, informing each participant of the current highest bid and asking them whether they increase their bid. This is a slow process that requires the participants’ trust in the auctioneer, a trust that she gained with the dozens of auctions she had already organized.

What are the issues that affect the participants’ behavior? According to Ms. Levy Tyller, the expectation to win the auction and the tension that builds along the process causes the bidders to increase their bids. Pressure from other participants, on the other hand, hinders price increase.

Most winners are the calculated and level-headed participants. Anxious bidders who make plenty of noise usually quit before the end. Moreover, those who come prepared and know well the status of the auctioned item, tend to win more often.

At the end of our conversation Ms. Levy Tyller admitted that I was the first game theory consultant she ever met. I take it as a good sign: the utility function of game theorists puts higher weight to research and teaching than to consulting jobs. I am glad to be part of this group.

This is not a post about academia, economics, or game theory. It is a post about life.

Today I am going to meet my French teacher for our weekly session, and I wanted to prepare an interesting topic for discussion. I pondered what can be an appropriate subject, and the words “the most gifted woman in the world” popped into my mind. Who is that woman and what did she do?

I typed “The most” in Google search bar. Auto-complete had few suggestions:

The most violent year

The most expensive car

The most expensive watch

The most dangerous game

The most beautiful woman

I went on. “The most gifted”. Auto-complete tried to read my mind:

The most gifted psychics

The most gifted child in the world

The most gifted man who ever lived

The tenth option was

The most gifted rapper in Nigeria

yet no mention of women.

Let’s continue. “The most gifted wo”. In how many ways can you complete this search? Two.

The most gifted child in the world

The most gifted person in the world

Fine, let’s add another letter. “The most gifted wom”.

Nothing. Not a single auto-completion. I beat Google. Probably everyone else know the answer and therefore nobody looked for it before me.

I am a stubborn guy. “The most gifted woman in the world”. Google is even more stubborn than me. They figured out that I mistyped my search phrase, and made few suggestions, in which the word “woman” was replaced by another boldfaced word: “child” and “person”. In fact, they thought that it is more probable that I search for “The most gifted psychics review” than for a gifted woman. Does this say something about me, about Google, or about the phrases that we search on the internet?

I did not give up and pressed “Enter”. The list that I got involved only (female) singers. Only? Almost. The tenth link was to “Support for Gifted Mothers: America Is Not a World Leader”.

I suspected that my misunderstanding with Google is due to language issues, and that the word “gifted” might refer to people’s vocal abilities, so I searched for “The most gifted man in the world”. Unfortunately I did not get any male singer. The first three pages referred me to the TV series “A Gifted Man”, number 30 pointed to King Solomon on a christian site, yet the title of number 32 was more promising: “Ten People with Unbelievable Talents”. Yes, I told myself, I finally got a proof that our world is full of chauvinism. I clicked on the link and found out that the most talented person in the list succeeded in pulling a truck with his XXXX. I did not bother to check the achievements of the other nine.

If this is the most significant accomplishment of men, no wonder auto-completion failed.

The world may be interested in the whereabouts of Vladimir Putin and whether Elton John’s boycott of Dolce and Gabbana turns out successful. Here, in the holy land, we wait for the elections to the parliament that will take place this Tuesday. The elections, which should have been an easy stroll for acting prime minister Benyamin Netanyahu, turned out to be a difficult trek.

There are 120 parliament members and two main candidates for the position of prime minister: Benyamin Netanyahu, who heads the right-wing parties, and Isaac Herzog, who heads the center-left parties. The way a government is formed is that after votes are counted and the 120 seats are divided among the parties, largely in proportion to the number of votes they got, the leader of each party recommends to the President of the state a person, which that party backs for the post of prime minister. The President then asks the person who he believes has the highest chance of creating a coalition to try and make a coalition. If that person fails, then the President asks another person to form a coalition, and so on.

Latest polls say that the parties that support Netanyahu will get about 55 seats, and the parties that support Herzog will get roughly the same number of seats. The remaining 10 seats will go to a newly formed party, headed by Moshe Kahlon, who said that he would recommend nobody to the President. Why did he say it? On the one hand, Kahlon was a parliament member of Netanyahu’s party (and a minister in Netanyahu’s government), and his voters come from the right wing. Saying that he will recommend Herzog is a death strike to his party. On the other hand, he was very critical about Netanyahu’s conduct and poses himself as “the true representative of the right wing”, and so he cannot say that he will recommend that person whom he thinks is not a good prime minister.

Who will then be the prime minister? PredictIT.com estimates at 56% the probability that Netanyahu is reelected. And if not? Then Herzog, as there is no one else.

Almost. Think of the situation from the eyes of game theory. Suppose that neither Netanyahu nor Herzog are recommended by more than half the votes. Herzog is in a bad shape: the Arab party is supposed to recommend Herzog, and polls predict that it will get about 13 seats. Traditionally the Arab parties are not invited to be part of the government: Herzog will have to make a coalition of more than 60 members excluding the Arab party; this is tough. Netanyahu is in a bad shape as well; polls predict that his party will lose quite a few seats, and if he will not be recommended by more than half the members of the parliament, he might be kicked out by his own party. So who can be the new prime minister?

I ask you to consider the swing voter, Kahlon. Since he comes from the right wing, all parties in the right wing will prefer him to Herzog. All the parties in the left wing will prefer him to Netanyahu or to any other candidate from the right wing. This is a hell of a reason not to recommend any of the two to the President.

Is my theory right? Will the swing voter use his power to become the real winner of the elections? We will have to wait a few weeks and see for ourselves.

This last week on Monday the Israeli Ministry of Communication auctioned 8 bands (each band of 5 mega-herz) for the use of fourth-generation cellular communication. The auction was interesting from a game theoretic perspective, and so I share it with the blog’s readers.

**The rules of the auction are as follows:**

- The auction is conducted on the internet, and each participant sits in his office.
- The minimum bid is 2,000,000 NIS per mega-herz (meaning 10M per band).
- Bids must be multiples of 100,000.
- No two bids can be the same: if a participant makes a bid that was already made by another participant, the e-system notifies the participant that his bid is illegal.

The auction is conducted in round.

In the first round:

- Each bidder makes a bid, which consists of the number of bands it asks for and the price he is willing to pay for each mega-herz. As mentioned above, no two participants can offer the same price.
- The e-system allocates the 8 band between the participants according to the bids: the bidder with the highest bid gets the number of bands it asked for, then the bidder with the second highest bid, and so on, until the 8 bands are exhausted.
- Each participant is told the number of bands it was allocated.
- The price in which the 8th band was allocated is called the threshold and posted publicly.
- The minimum price for the next round is the threshold + 200,000 NIS.

Two governments did not survive this week: the Swedish and the Israeli. Here, in Israel, people are interested in the effect of the coming elections on the financial market. The Marker, the most important national daily economics newspaper, published an article on this issue. The chief economist of the second largest investment house, which handles about 30 billion USD, is quoted as saying (my own translation)

“Past experience shows that most of the time, during six months after elections the stock market was at a higher level than before the elections,” emphasized Zbezinsky (the chief economist, ES). The Meitav-Dash investment house checked the performance of the TA-25 Index (the index of the largest 25 companies in the Israeli stock exchange, ES) in the last six elections. They compared the index starting from 6 months before elections up to six months after elections, and the result was that the average return is positive and equals 6%.

To support this claim, a nice graph is added:

Even without understanding Hebrew, you can see the number 25 at the title, which refers to the TA-25 index, the six colored lines in the graph, where the x-axis measures the time difference from elections (in months), and the year in which each elections took place. Does this graph support the claim of the chief economist? Is his claim relevant or interesting? Some points that came up to a non-economist like me are:

- Six data points, this is all the guy has. And from this he concludes that “most of the time” the market increased. Well, he is right; the index increased four times and decreased only twice.
- Election is due 17-March-2015, which means three and a half months. In particular, taking as a baseline 6 months before election is useless; this baseline is well into the past.
- Some of the colored lines seem to fluctuate, suggesting that some external events, unrelated to elections, may have had an impact on the stock market, like the Intifada in 2001 or the consequences of the Lebanon war before the 2009 elections. It might be a good idea to check whether some of these events are expected to occur in the coming nine months and a half.
- It will also be nice to compare the performance around elections to the performance in between elections. Maybe 6% is the usual performance of the TA-25, maybe it is usually higher, and maybe it is usually lower.

I am sure that the readers will be able to find additional points that make the chief economist statement irrelevant, while others may find points that support his statement. I shudder to the thought that this guy is in charge of some of my retirement funds.

Abraham Neyman and Sergiu Hart are two of the prominent mathematical game theorists to date. Neyman contributed immensely to the study of the Shapley value, stochastic games, and repeated games and complexity. Hart contributed significantly to the study of correlated equilibrium and adaptive processes leading to it, value theory, and formation of coalitions.

Both Abraham and Sergiu will be 66 next year. To celebrate this rare occasion, the Center for the Study of Rationality at the Hebrew University of Jerusalem organizes two conferences, one in honor of each of them. The conference in honor of Abraham will be held on June 16–19, 2015, and the conference in honor of Sergiu will follow on June 21–24, 2015.

Mark the dates and reserve tickets.

Last week I wrote a post about two issues with Elsevier’s e-system, which is the system that all journals run by Elsevier, including *Games and Economic Behavior* and *Journal of Mathematical Economics*, use for handling submissions: the fact that sometimes reviewers can see the blinded comments that other reviewers wrote to the editor, and the user agreement that allows Elsevier to change its terms without notifying the users.

After I corresponded with the editors of *Games and Economic Behavior* and *Journal of Mathematical Economics* and with the Economics Editor of Elsevier, the reason for the privacy breach became clear: the e-system allows each editor to choose whether the blinded comments of one referee to the author and the blinded comments of one referee to the editor will be seen by other reviewers. For each type of blinded comments the editor can decide whether to show it to all reviewers or not. Each editor makes his or her own choice. I guess that often editors are not aware of this option, and they do not know what was the choice that the previous editor, or the one before him, made.

Apparently, the configuration of *Games and Economic Behavior* was to allow reviewers to see only the blinded comments to the author, while the configuration of *Journal of Mathematical Economics* was to allow reviewers to see both types of blinded comments. Once the source of the problem became clear, Atsushi Kajii, the editor of *Journal of Mathematical Economics* decided to change the configuration, so that the blinded comments of reviewers to the editor will not be seen by other reviewers. I guess that in few days this change will become effective. Elsevier also promised to notify all of its journals, in which the configuration was like that of JME, about this privacy issue, and let the editors decide whether they want to keep this configuration or change it. In case this configuration remains, they will add a warning that warns the referee that the blinded comments can be read by other reviewers.

I am happy that the privacy breach came to a good end, and that in the future the e-system will keep the privacy the referees.

Regarding the second issue, Elsevier is not willing to change its user agreement. Reading the user agreements of other publishers, like Springer and INFORMS, shows that user agreements can be reasonable, and not all publishers keep the right to change the user agreement without notifying the users. The Economics Editor of Elsevier wrote: “This clause is not unreasonable as the user can choose to discontinue the services at any time.” As I already wrote in the previous post, I choose to discontinue the service.

*Games and Economic Behavior*, *Journal of Economic Theory*, *Journal of Mathematical Economics*, and *Economics Letters* are four journals that publish game theoretic papers and are published by Elsevier. They all use Elsevier e-system to handle submissions. I already talked in the past about the difficulty of operating these e-systems. Rakesh discussed the boycott against Elsevier. Recently I had some experience that made me stop using the Elsevier’s system altogether, even though I serve on the editorial board of *Games and Economic Behavior*. I will not use Émile Zola’s everlasting words for such an earthly matter; I will simply tell my experience.

**1) The e-system seems to be sometimes insecure.**

I was surprised when a referee with whom I consulted on the evaluation a paper (for GEB) told me that the system showed to him the private message that the other referee wrote to me, and that the same thing happened to him with JME. To prove his point, he sent to me screenshots with the private letter of the other referee for JME.

**2) The user agreement of Elsevier is a contract that one should never agree to sign.**

I guess no one bothered to read the user agreement of Elsevier. I did. The first paragraph binds us to the agreement:

This Registered User Agreement (“Agreement”) sets forth the terms and conditions governing the use of the Elsevier websites, online services and interactive applications (each, a “Service”) by registered users. By becoming a registered user, completing the online registration process and checking the box “I have read and understand the Registered User Agreement and agree to be bound by all of its terms” on the registration page, and using the Service, you agree to be bound by all of the terms and conditions of this Agreement.

The fourth paragraph, titled “changes” says that any change made to the contract is effective immediately, and so it binds you. If you want to make sure they did not add some paragraph to which you disagree, you must read the whole user agreement every time you use the system.

Elsevier reserves the right to update, revise, supplement and otherwise modify this Agreement from time to time. Any such changes will be effective immediately and incorporated into this Agreement. Registered users are encouraged to review the most current version of the Agreement on a periodic basis for changes. Your continued use of a Service following the posting of any changes constitutes your acceptance of those changes.

I contacted Elsevier about the user agreement and got the following response:

The Elsevier website terms and conditions (see http://www.elsevier.com/legal/elsevier-website-terms-and-conditions) cannot be customized upon request; however, these terms and conditions do not often change and notification would be provided via the “Last revised” date at the bottom of this page. The current terms and conditions were Last revised: 26 August 2010.

Well, it is comforting that they did not make any change in the past four years, but will Elsevier’s CEO agree to open an account in a bank that has the “change” paragraph in the contract?

I stopped using the e-system of Elsevier, both as a referee and as an editor.

Roscoff is a village at the north-west corner of France, located on a small piece of land that protrudes into the English canal. Right here, in 1548, the six-year-old Mary, Queen of Scots, having been betrothed to the Dauphin François, disembarks.

As far as I understood, the most common sights in the area are tourists and sea food. As far as I can tell, the main advantage of Roscoff is the Laboratoire Biologique, which is used to host conferences. Every now and then the French game theory group makes use of this facility and organizes a conference in this secluded place. The first week of July was one of these nows and thens. This is my third time to attend the Roscoff conference, and I enjoyed meeting colleagues, the talks, and the vegetarian food that all non-sea-food eaters got.

Here I will tell you about one of the talks by Roberto Cominetti.

Brouwer’s fixed point theorem states that every continuous function $f$ that is defined on a compact and convex subset $X$ of a Euclidean space has a fixed point. When the function $f$ is a contraction, that is, when there is $ρ ∈ [0,1)$ such that $d(f(x),f(y)) ≤ ρ d(x,y)$ for every $x,y \in X$, then Banach’s fixed point theorem tell us that there is a unique fixed point $x*$ and there is an algorithm to approximate it: choose an arbitrary point $x_0 ∈ X$ and define inductively $x_{k+1} = f(x_k)$. The sequence $(x_k)$ converges to $x*$ at an exponential rate.

When the function $f$ is non-expansive, that is, $d(f(x),f(y)) \leq d(x,y)$ for every $x,y \in X$, there may be more than a single fixed point (e.g., $f$ is the identity) and the sequence defined above need not converge to a fixed point (e.g., a rotation in the unit circle).

In his talk, Roberto talked about a procedure that does converge to a fixed point when $f$ is non-expansive. Let $(α_k)$ be a sequence of numbers in $(0,1)$. Choose $x_0 ∈ X$ in an arbitrary way and define inductively $x_{k+1} = α_{k+1} f(x_k) + (1-α_{k+1}) x_k$. Surprisingly enough, under this definition the distance $d(x_k,f(x_k))$ is bounded by

d(x_k,f(x_k)) ≤ C diameter(X) / \sqrt( α_1 (1-α_1) + α_2 (1-α_2) + … + α_n (1-α_n) ),

where C = 1/\sqrt(π).

In particular, if the denominator goes to infinity, which happens, for example, if the sequence $(α_k)$ is constant, then the sequence $(x_k)$ converges to a fixed point. Since the function that assigns to each two-player zero-sum strategic-form game its value is non-expansive, this result can become handy in various situations.

This is a good opportunity to thank the organizers of the conference, mainly Marc Quincampoix and Catherine Rainer, who made a great job in organizing the week.

## Recent Comments