Tagged: forecasting

Mathis Lohaus

Scenarios for European External Relations in 2025

Dahrendorf Symposium

Last week I had the pleasure of attending (parts of) the 2016 Dahrendorf Symposium, hosted by the Hertie School of Governance, LSE and the Mercator Foundation. The event focused on European foreign policy. I will summarize the debates on the final day in a separate blog post.

A few months ago, Hertie School hosted a scenario planning workshop as part of the Dahrendorf project. It focused on the EU’s relations with other world regions, trying to draw up scenarios for the year 2025. Meeting in five different working groups, the participants developed scenarios for the future relations between the EU and the U.S., China, Russia and Ukraine, Turkey, and the MENA region. Given my interest in forecasting and curiosity about scenario planning, I gladly signed up and contributed to the EU/U.S. working group.

At the Dahrendorf Symposium last week, Monika Sus and Franziska Pfeifer (who are coordinating the scenario project) briefly described our method and results to the audience. The publication with our 18 (!) brief scenarios is available via the Dahrendorf blog: European Union in the World 2025 – Scenarios for EU relations

The results are interesting and I really encourage you to download the document! Personally, I particularly enjoyed the process. It was a great exercise to think about basic assumptions we have about transatlantic relations; to identify key drivers relevant for change; and to come up with scenarios that reflect the most relevant combinations of key drivers taking particular directions.

Transatlantic mistrust on tech
Illustrations for the scenario report by Jorge Martin

Let me indulge in a bit of self-promotion and quote the intro to my group’s scenario:

“In the years up to 2025 there will be a situation of balkanised technological regulation in the EU, driven by political debates which emphasise the need to shield national markets and societies against the uncertain effects of technological progress. On the other side of the Atlantic, political leaders will continue to embrace new technologies, with an emphasis on keeping the competitive edge also in terms of offensive capabilities in the cyber and AI realms. Only after a series of trigger events, increasing the pressure on decision-makers, will transatlantic leaders be willing to invest in a new institutional framework to manage the political problems associated with technological progress.” (‘Transatlantic Frankenstein’ scenario)

Then, of course, there was the Dahrendorf Symposium, which included a couple of workshop sessions (that I couldn’t attend) and two round-table panels on the final day. I will put my summary of these discussions into a separate post.

Mathis Lohaus

The Amateur Forecaster’s Diary Pt. 4

Good Judgment Project

This series returns because of popular demand (read: one email to the author). If you have no idea what this is about, please consult parts one, two and three.

As I mentioned in my last post on the Good Judgment Project (GJP), in season 3 I was part of a team of “super forecasters”. My team did OK, but we were significantly less successful than the other “supers”. This season has now come to an end and the project is about to launch the next one in August. I’d like to offer some reflections.

What went well?

We were able to exchange information across “super” groups, and some people were really impressive. I saw spreadsheets and discussions that were much more sophisticated than what I expected in this “just for fun” setting. Apparently, the increasingly challenging task (with many rather tough questions and a high workload) really sparked the participants’ ambitions.

From what I could see in the cross-team forum, communication was very lively and mostly helpful. But I’m not sure how much it really mattered in the end. Given the number of questions and the old-fashioned bulletin board format, it was hard to keep up with every possibly relevant bit of information.

My own team brought together people from very different backgrounds, and it was nice to have a round of introductions. Given that the questions belong to different clusters, we proceeded to assign everyone two areas to prioritize when researching and answering questions.

What didn’t go so well?

Judging from the experience in my particular group, not everything was “super” in the end. My new group did not invest much more time and effort than my previous, “normal” teams. After all, the incentives and potential pitfalls are very similar: if communication does not yield results, people stop typing long messages; if there are only three or four active users, the team cannot perform as well; high cognitive load due to many open questions can be discouraging.

One innovation for season 3, the use of (paid) facilitators to ensure smoothly working teams, fell completely short — at least in our group. In theory, the facilitator would have helped us with coordinating tasks and making sure that no items are forgotten. But the person in charge did not really live up to that promise, and the only email I can remember getting from him is a goodbye note. This might be worth trying again, though.

Finally, one lesson I draw from my experience in the newly established and ultimately lowest-ranking “super” team: Being put in an environment that’s supposedly excellent and seeing the amount of work and experience the other teams bring to the table can be intimidating. I’m not sure how much work the GJP organizers should put into creating a positive team spirit and providing regular feedback, but at least in our case it might have helped.

The future of forecasting

I’m curious to see whether season 4 will be able to push the limits of what works in a “just for fun” effort. (After all, teenagers in World of Warcraft guilds spend a lot of time on coordination and planning, too.) It seems that the Good Judgment Project operators are considering some changes to the user interface to help manage the workload. I agree that there is some room for improvement.

I would love to hear what the GJP researchers have to say on the merit of inter-group exchange. Theoretically, it could either lead to groupthink or help everyone improve by leading to efficient information sharing. Generally, I am not sure to what extent the “meta” discussions and well-meaning exchange of tips that took place between teams are at odds with optimizing performance. Given that everyone has limited resources, maybe this process should be streamlined and formalized. Easy sharing of links to news sources is probably a good idea.


Of course none of that can change the fact that many questions are just impossible to answer with any kind of certainty. For example: predicting the behavior of small groups with secretive proceedings (Vatican, North Korea, Taliban…) or factoring in different layers of scientific and political uncertainty (“will the Swiss lab report that Arafat’s body contained a significantly elevated level of polonium-210?”). But it’s still fun to try your best, and I’m looking forward to season 4.

Mathis Lohaus

Links: Forecasts for 2014; How to Rank IR Journals

(CC) Sanjay Acharya via Wikimedia Commons

Today’s your last chance to take part in New America’s Weekly Wonk 2014 Forecasting Contest. Most of the questions are brilliant, but my favorite is #4:

Which number will be highest in 2014?

  • Gold medals won by China in the Sochi Winter Olympics
  • Oscars won by American Hustle
  • Days the temperature exceeds 100 degrees Fahrenheit in Washington, D.C.
  • Mentions of the word “progress” by Barack Obama at his State of the Union Address

I’ll pick my answers after finishing this blog post… (via Tobias Bunde)

A far more serious forecasting exercise was just published by Jay Ulfelder. He and Ben Valentino used a survey to plot the likelihood of “state-led mass killings” around the world:

Data & map by Jay Ulfelder and Ben Valentino (Dart-Throwing Chimp)

It is very important to understand that the scores being mapped and plotted here are not probabilities of mass-killing onset. Instead, they are model-based estimates of the probability that the country in question is at greater risk than any other country chosen at random. In other words, these scores tell us which countries our crowd thinks we should worry about more, not how likely our crowd thinks a mass-killing onset is.
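To make the distinction in the quote concrete, here is a minimal sketch of a pairwise score. The quote describes model-based estimates; the code below swaps in a much cruder stand-in (the raw share of head-to-head comparisons an item wins), and the function name, the placeholder labels and the votes are all made up for illustration. It is not the method used for the map.

```python
from collections import defaultdict


def pairwise_scores(votes):
    """Return, for each item, the share of its pairwise comparisons it won.

    `votes` is an iterable of (higher_risk, lower_risk) pairs, one per
    survey response. This is a hypothetical, simplified stand-in for the
    model-based estimates mentioned in the quote.
    """
    wins = defaultdict(int)
    appearances = defaultdict(int)
    for winner, loser in votes:
        wins[winner] += 1
        appearances[winner] += 1
        appearances[loser] += 1
    return {item: wins[item] / appearances[item] for item in appearances}


# Made-up votes over three placeholder countries.
example_votes = [("A", "B"), ("A", "C"), ("B", "C")]
example_scores = pairwise_scores(example_votes)
# A wins both of its comparisons, B one of two, C none.
```

A score of 1.0 for "A" here only says that the crowd always ranked it above whatever it was paired with. It says nothing about how likely an onset actually is, which is the point of the caveat above.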

Also note that fewer than 150 people took part in the survey. Still, the method is neat.

In another post, Jay sheds some light on how and when he is going to post his 2014 forecasts for coups. Again, it turns out that attaching probabilities to very rare events is extremely tough. The post also illustrates that choosing a data source can be driven more by administrative reasons (who publishes when, how often, and how transparently) than by trust in its quality (or: optimal fit for the purpose)…

At the Duck of Minerva, Brian J. Phillips explores how to rank IR journals:

What are the best International Relations journals? How do we know if one journal is better than another? And how should this affect your decision about where to send a manuscript?

Phillips considers surveys (TRIP) and citation indexes (Thomson-Reuters, Google, etc.) that might be relevant to come up with a ranking. You won’t be surprised by the top 10. But the data, the discussion and follow-up questions, and the (critical) comments are well worth reading…

Mathis Lohaus

Links: Drones; Forecasting; Ranking Researchers; Surveillance Logic

A combat drone, because that’s the most photogenic of all topics covered here today… (Wikimedia commons)

I hope you’re having a great week so far! My fellow bloggers have other obligations, so you’ll have to tolerate my incoherent link lists for the time being…

At the Duck of Minerva, Charli Carpenter makes a crucial point regarding the debate on military drones (emphasis added):

In my view, all these arguments have some merit but the most important thing to focus on is the issue of extrajudicial killing, rather than the means used to do it, for two reasons. First, if the US ended its targeted killings policy this would effectively stop the use of weaponized drones in the war on terror, whereas the opposite is not the case; and it would effectively remove the CIA from involvement with drones. It would thus limit weaponized drones to use in regular armed conflicts that might arise in the future, and only at the hands of trained military personnel. If Holewinski and Lewis are right, this will drastically reduce civilian casualties from drones.

I’d like to recommend a couple of links on attempts to forecast political events. First, the always excellent Jay Ulfelder has put together some links on prediction markets, including a long story in the Pacific Standard on the now defunct platform Intrade. Ulfelder also comments on “why it is important to quantify our beliefs”.

Second (also via Ulfelder), I highly recommend the Predictive Heuristics blog, which is run by the Ward Lab at Duke University. Their most recent post covers a dataset on political conflict called ICEWS and its use in the Good Judgment Project, a forecasting tournament that I have covered here on the blog as well. (#4 of my series should follow soon-ish.)

A post by Daniel Sgroi at VoxEU suggests a way for panelists in the UK Research Excellence Framework (REF) to judge the quality of research output. Apparently, there is a huge effort underway to rank scholars based on their output (i.e., publications) — and the judges have been explicitly told not to consider the journals in which articles were published. Sgroi doesn’t think that’s a good idea:

Of course, economists are experts at decision-making under uncertainty, so we are uniquely well-placed to handle this. However, there is a roadblock that has been thrown up that makes that task a bit harder – the REF guidelines insist that the panel cannot make use of journal impact factors or any hierarchy of journals as part of the assessment process. It seems perplexing that any information should be ignored in this process, especially when it seems so pertinent. Here I will argue that journal quality is important and should be used, but only in combination with other relevant data. Since we teach our own students a particular method (courtesy of the Reverend Thomas Bayes) for making such decisions, why not practise what we preach?

This resonates with earlier debates here and elsewhere on how to assess academic work. There’s a slippery slope if you rely on publications: in the end, are you just going to count the number of peer-reviewed articles in a CV without ever reading any of them? However, Sgroi is probably right to point out that it’s absurd to disregard entirely the most important mechanism of quality control this profession has to offer, despite all its flaws.

Next week, the Körber-Stiftung will hold the 3rd Berlin Foreign Policy Forum. One of the panels deals with transatlantic relations. I wonder if any interesting news on the spying scandal will pop up in time. Meanwhile, this talk by Dan Geer on “tradeoffs in cyber security” illustrates the self-reinforcing logic of surveillance (via Bruce Schneier):

Unless you fully instrument your data handling, it is not possible for you to say what did not happen. With total surveillance, and total surveillance alone, it is possible to treat the absence of evidence as the evidence of absence. Only when you know everything that *did* happen with your data can you say what did *not* happen with your data.

Mathis Lohaus

Links: Elections, Constitutions, PhDs, Instability, and Teaspoons

The teaspoon population in the author’s research center

Mark Kayser and Arndt Leininger sum up the results of their German election forecasting model and compare it to others. They had predicted a share of 47% for CDU/CSU and FDP (very close to the actual 46.3%). But they also point out that it’s much harder to predict the stability of coalitions…

Our model drew on previous election outcomes, characteristics of the government and of voters and, most originally, the relative economic performance of Germany in comparison to the two other most important economies in Europe (…). Our model fared at least as well as traditional polling, making us optimistic about the future of forecasting elections in general and forecasting German elections in particular.

The Comparative Constitutions project has launched a great new website called “Constitute” allowing everyone to get to know constitutions from all over the world. You can browse by country or by topic, but it seems that older versions are not included (via Monkey Cage).

Henry Farrell compares the controversy about the analyst Elizabeth O’Bagy to the case of former German defense minister Karl-Theodor zu Guttenberg, who had to resign (for plagiarism in his dissertation) but now works at a respected D.C. think-tank:

O’Bagy’s academic credentials were crucial to her status as an ‘expert.’ When these credentials exploded, so did her career. Zu Guttenberg’s value rests not on his purported academic training, but on his past political role and current political connections.

Jay Ulfelder argues that we live in a time of systemic instability, which is only inadequately captured by observers that stick to a perspective where “countries are a bit like petri dishes lined up on a laboratory countertop”. So we ought to think harder about connecting the dots between state failures, increasing piracy, the financial crisis, food prices, and long-time cycles of social unrest (which look slightly esoteric to me)…

…and since it’s Friday: Please make sure to read this research paper on the fate of teaspoons placed in the communal rooms of university research labs (via MR).

56 (80%) of the 70 teaspoons disappeared during the study. (…) The half life of teaspoons in communal tearooms (42 days) was significantly shorter than for those in rooms associated with particular research groups (77 days). The rate of loss was not influenced by the teaspoons’ value. (…) At this rate, an estimated 250 teaspoons would need to be purchased annually to maintain a practical institute-wide population of 70 teaspoons. (…) The loss of workplace teaspoons was rapid, showing that their availability, and hence office culture in general, is constantly threatened.
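For readers who want to play with the half-life figures, here is a small sketch. It assumes simple exponential decay, which is my assumption and not necessarily how the paper computed its numbers; the function name is made up, and the sketch does not attempt to reproduce the paper's estimate of 250 replacement teaspoons per year.

```python
def surviving_fraction(days, half_life_days):
    """Fraction of teaspoons still present after `days`.

    Assumes plain exponential decay with the given half-life
    (an illustrative assumption, not the paper's own calculation).
    """
    return 0.5 ** (days / half_life_days)


# Half-lives quoted above: 42 days (communal tearooms), 77 days (group rooms).
communal_after_12_weeks = surviving_fraction(84, 42)  # two half-lives -> 0.25
group_after_11_weeks = surviving_fraction(77, 77)     # one half-life -> 0.5
```

Under this assumption, a communal tearoom is down to a quarter of its teaspoons after twelve weeks, while a research group's room still has half of them after eleven.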

Sören Stapel

Links: German elections, grad student advice, IL/IR symposium, O’Bagy

Election Day in Germany is on Sunday. Yesterday I attended the briefing for my duties as a poll worker that day. As we all know, Germans are said to be very organized and efficient, but they can be harsh. This event proved the rule. And I feel like making fun of one specific disadvantage of being German:

German elections and forecasting

Back to serious issues. A few weeks ago I lamented the state of forecasting for Germany’s federal elections in 2013. Sadly, I wasn’t aware of Kai Arzheimer’s work. In mid-August, he launched a series of blog posts on forecasting the German elections, with some follow-ups here, here, here, and here. You could also have a glance at his code and data for replication, or just visit his blog in general, which is very entertaining.

He also has a piece in the online edition of Al Jazeera on Germany’s elections, the EU, and the future of the Euro.

The European Council on Foreign Relations is currently running a great series looking at how the German elections are viewed by other EU partners. So far, the series has covered Poland, France, Italy, Bulgaria, Britain, and Spain.

Scholars from the Social Science Research Center in Berlin (WZB) have looked at party manifestos of all German federal elections. Their data is now available and they have published some findings at the Democracy & Democratization blog. See also their introduction to the Manifesto project. The online edition of the newspaper Die Zeit also presented some of their findings (in German). The bottom line is: political parties differ on many issues in their party manifestos and there is a general turn to the left regarding both economic and socio-political dimensions (less market-oriented and more progressive). But, of course, exceptions prove the rule.

Sören Stapel

Elections in Germany: Forecasts and Polls

The election campaign in Germany is about to gather speed with less than 30 days left until election day. I assume we’re going to cover that in more depth soon, too. For now, I can direct you to the Hertie School’s Expert Blog on the German Federal Elections in 2013 in case you have not checked it out yet. They cover a plethora of topics from labour market policies and the German Energiewende to gender equality and family policy.

So let me kick off some posts that will appear on this blog over the next couple of weeks. This one is not about politics, though: it looks at polling and forecasting in the German case, thereby briefly touching upon some of the recent trends of the German political landscape. I will point out some of the flaws of both polling and forecasts. However, don’t misread the point: I’m not against polls and forecasts as such and have lots of fun following the respective discussion throughout the year. But we should not overemphasize these results, either, as neither comes without problems.


Mathis Lohaus

The Amateur Forecaster’s Diary Pt. 3

Good Judgment Project

At the beginning of August, the Good Judgment Project (GJP) kicked off the third round of forecasting. As I’ve described in two earlier blog posts (#1 and #2), the idea is to come up with the best way to predict the outcomes of political negotiations and elections, or just the likelihood of hypothetical events.

I’ve been promoted to one of eight teams of “superforecasters”. One of the perks coming with this upgrade was seeing the training materials before anyone else. The GJP researchers also invited us to a small conference/workshop in Philadelphia to discuss last season and prepare for the new one. I was unable to attend, but the slides were very informative. In addition, each “super” team now has a facilitator assigned to it, who is meant to help with coordination.

According to the info given by Phil Tetlock, the GJP convincingly won the first two rounds of the tournament. The four competing research programs have now been shut down and some of their forecasters joined this project. Judging from the GJP’s repeat success, geopolitical forecasting rewards skill rather than luck. This is supported by the GJP’s internal data: 50 of the 60 top forecasters from season 1 ended up at the top in season 2. So there appears to be less regression to the mean than one might expect.

This slide from Phil Tetlock’s presentation illustrates the scale of Brier scores for measuring forecast accuracy
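For readers unfamiliar with the measure, here is a minimal sketch of a Brier score. It assumes the original multi-category formulation, which sums squared errors over all answer options and therefore runs from 0 (perfect) to 2 (confidently wrong); whether the slide uses exactly this version is my assumption, and the function names are made up.

```python
def brier_score(forecast, outcome_index):
    """Brier score for one question (original multi-category form).

    `forecast` lists the probabilities assigned to each answer option,
    `outcome_index` is the position of the option that actually occurred.
    Returns a value between 0 (perfect) and 2 (worst possible).
    """
    if abs(sum(forecast) - 1.0) > 1e-9:
        raise ValueError("forecast probabilities must sum to 1")
    return sum(
        (p - (1.0 if i == outcome_index else 0.0)) ** 2
        for i, p in enumerate(forecast)
    )


def mean_brier(forecasts, outcomes):
    """Average Brier score over several questions."""
    scores = [brier_score(f, o) for f, o in zip(forecasts, outcomes)]
    return sum(scores) / len(scores)


# A 50/50 guess on a yes/no question scores 0.5 whatever happens.
coin_flip = brier_score([0.5, 0.5], 0)
```

Lower is better: always answering 50/50 on binary questions yields 0.5, so a forecaster has to beat that to show any skill at all.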


Mathis Lohaus

Links: Taking Kids on Field Trips; Forecasting; Cyber Security; Syria’s Future; Football and Violence; New UN Blog; Honest Acknowledgments

Temperatures in Berlin are falling. Let’s wait and see what this means for the blog…

A great match to our little series on parenting: Kim Yi Dionne writes about “taking children to an African country while you conduct research” (via the Duck)

Jay Ulfelder has two great posts on forecasting. One deals with common “screw-ups” in predictive models. The other is about the ethics of statistical forecasting, and the responsibility of researchers to be honest about their limits:

The fact that we use mathematical equations to generate our forecasts and we can quantify our uncertainty doesn’t always mean that our forecasts are more accurate or more precise than what pundits offer, and it’s incumbent on us to convey those limitations. It’s easy to model things. It’s hard to model them well, and sometimes hard to spot the difference.

Brandon Valeriano offers a comprehensive reading list on cyber security, nicely balancing intro stuff and very specialized articles.

Jeffrey Stacey writes about Syria’s future (“intervening not now but later”), with a big potential role for the EU:

It is difficult to predict which way the current conflict in Syria will end up, as even some sort of stalemate could be the result.  But if opposition forces were ultimately successful in defeating Assad’s forces then it would be difficult for Western governments to ignore their shared security interests in the assurance of post-conflict stability in Syria.

Andrew Bertoli has a paper about nationalism and aggression, arguing that countries that qualify for the football/soccer World Cup behave more aggressively. German weekly Zeit has an interview with him (h/t Tobias Bunde).

Instead of lamenting the state of the German twitter- and blogosphere, let’s try and improve networking! So far, I had completely overlooked the blog “Junge UN Forschung”, written by members of the German junior researchers’ working group for UN studies (h/t Christian Kreuder-Sonnen).

Finally, Dan Drezner offers 15 examples of a world where book acknowledgments are really honest, such as:

I’m grateful to Peter Klugman, a Big Shot in my field who made a useful offhand comment to me once. People reading this will hopefully think I really know him and therefore be impressed.

Mathis Lohaus

Links: Voting reform, Forecasting, PRISM, Germany

Detail from “A Simple Proposal to Stop Gerrymandering”, Saturday Morning Breakfast Cereal

Summer break has begun in Germany. Wherever you are, enjoy your time in the sun! In case you’re stuck inside (or using a handheld device instead of just relaxing in the park), here are some links:

  • One of my favorite web comics has an episode on how to reform voting districts; it involves strict rules, is based on incentives and public scrutiny, and leaves little room for corruption.
  • The forecasting competition in which I take part (Good Judgment Project) is about to kick off season 3. I plan to cover the next steps here on the blog, in particular because I have now been promoted to “super forecaster” status. Please consider reading part 1 and part 2 of my coverage so far.
  • Edward Snowden’s fate is still undecided and the news about U.S./UK surveillance will probably keep going. For Germany, there is a new angle to the whole story in the aftermath of interior minister Friedrich’s visit to Washington: “many were critical of his trip, saying he was given little information and came across like an obedient school boy” (SPIEGEL).
  • Friedrich is now under fire for suggesting that several terrorist attacks on German soil have been avoided thanks to PRISM; a statement that was not backed up by facts. He also neatly summarized the ‘let’s give up civil liberties for counter-terrorism’ logic: “The noble intention of saving lives in Germany justifies working with our American friends and partners …” (my translation; via law blog)
  • Chancellor Merkel, on the other hand, is extremely careful not to say anything at all in her recent interviews on the topic.