Archive for University of Warwick

Au’Bayes 17

Posted in Statistics, Travel, University life on December 14, 2017 by xi'an

Some notes scribbled during the O’Bayes 17 conference in Austin, not reflecting on the highly diverse range of talks. And many new faces and topics, meaning O’Bayes is alive and evolving. With all possible objectivity, a fantastic conference! (Not even mentioning the bars where Peter Müller hosted the poster sessions, a feat I would have loved to see duplicated for the posters of ISBA 2018… Or the Ethiopian restaurant just around the corner with the right amount of fierce spices!)

The wiki on objective, reference, vague, neutral [or whichever label one favours] priors that was suggested at the previous O’Bayes meeting in Valencià was introduced as Wikiprevia by Gonzalo Garcia-Donato. It aims at classifying recommended priors in most of the classical models, along with discussion panels, and it should soon get an official launch, when contributors will be welcome to include articles on a wiki principle. I wish the best to this venture which, I hope, will induce O’Bayesians to contribute actively.

In a brilliant talk that quickly reversed my jetlag doziness, Peter Grünwald returned to the topic he presented last year in Sardinia, namely safe Bayes or powered-down likelihoods to handle some degree of misspecification, with a further twist of introducing an impossible value ‘o’ that captures missing mass (to be called Peter’s demon?!), the absolute necessity of which I did not perceive. Food for thought, definitely. (But I feel that the only safe Bayes is the dead Bayes, as protecting against all kinds of misspecifications means no action is possible.)

I also appreciated Cristiano Villa’s approach to constructing prior weights in model comparison from a principled and decision-theoretic perspective even though I felt that the notion of ranking parameter importance required too much input to be practically feasible. (Unless I missed that point.)

Laura Ventura gave her talk on using various scores or estimating equations as summary statistics for ABC, rather than the corresponding M-estimators, which offers the appealing feature of reducing computation while being asymptotically equivalent. (A feature we also exploited for the regular score function in our ABC paper with Gael, David, Brendan, and Worapree.) She mentioned the Hyvärinen score [of which I first heard in Padova!] as a way to bypass issues related to doubly intractable likelihoods. Which is a most interesting proposal that bypasses (ABC) simulations from such complex targets by exploiting a pseudo-posterior.

Veronika Rockova presented a recent work on concentration rates for regression tree methods that produces a rigorous analysis of these methods. Showing that the spike & slab priors plus BART [equals spike & tree] achieve sparsity and optimal concentration. In an oracle sense. With a side entry on assembling partition trees towards creating a new form of BART. Which made me wonder whether or not this was also applicable to random forests. Although they are not exactly Bayes. Demanding work in terms of the theory behind it but with impressive consequences!

Just before I left O’Bayes 17 for Houston airport, Nick Polson, along with Peter McCullagh, proposed an intriguing notion of sparse Bayes factors, which corresponds to the limit of a Bayes factor when the prior probability υ of the null goes to zero. The limiting prior is then replaced with an exceedance measure that can be normalised into a distribution, but does it make the limit a special prior? Linking υ with the prior under the null is not an issue (this was the basis of my 1992 Lindley paradox paper) but the sequence of priors indexed by υ needs to be chosen. And reading from the paper at Houston airport, I could not spot a construction principle that would lead to a reference prior of sorts. One thing that Nick mentioned during his talk was that we observed directly realisations of the data marginal, but this is generally not the case as the observations are associated with a given value of the parameter, not one for each observation.

The next edition of the O’Bayes conference will be in… Warwick on June 29-July 2, as I volunteered to organise this edition (16 years after O’Bayes 03 in Aussois!) just after the BNP meeting in Oxford on June 23-28, hopefully creating the environment for fruitful interactions between both communities! (And jumping from Au’Bayes to Wa’Bayes.)

AlphaGo [100 to] zero

Posted in Books, pictures, Statistics, Travel on December 12, 2017 by xi'an

While in Warwick last week, I read a few times through the Nature article on AlphaGo Zero, the new DeepMind program that learned to play Go by itself, through self-learning, within a few clock days, and achieved massive superiority (100 to 0) over the earlier version of the program, which (who?!) was based on a massive database of human games. (A Nature paper I also read while in Warwick!) From my remote perspective, the neural network associated with AlphaGo Zero seems more straightforward than the double network of the earlier version. It is solely based on the board state and returns a probability vector p for all possible moves, as well as the probability v of winning from the current position. There are still intermediary probabilities π produced by a Monte Carlo tree search, which drive the computation of a final board, the (reinforced) learning aiming at bringing p and π as close as possible, via a loss function like

(z-v)²-<π, log p>+c‖θ‖²

where z is the game winner and θ is the vector of parameters of the neural network. (Details obviously missing above!) The achievements of this new version are even more impressive than those of the earlier one (which managed to systematically beat top Go players) in that blind exploration of game moves repeated over some five million games produced a much better AI player. With a strategy at times remaining a mystery to Go players.
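To make the three pieces of this loss concrete, here is a minimal sketch in plain Python, taking the last term to be the squared-norm penalty c‖θ‖². The function name `alphago_zero_loss`, the coding of the winner z as ±1, and the default value of c are my own assumptions for illustration, not details taken from the post.

```python
import math


def alphago_zero_loss(z, v, pi, p, theta, c=1e-4):
    """Evaluate (z - v)^2 - <pi, log p> + c * ||theta||^2 for one position.

    z     : game outcome from the current player's side (assumed coded as +1 / -1)
    v     : value predicted by the network for the position
    pi    : move probabilities produced by the tree search
    p     : move probabilities returned by the network
    theta : flat list of network parameters
    c     : weight of the penalty (hypothetical default)
    """
    value_term = (z - v) ** 2
    # Cross-entropy between search and network probabilities;
    # moves with zero search probability contribute nothing.
    policy_term = -sum(pi_a * math.log(p_a) for pi_a, p_a in zip(pi, p) if pi_a > 0)
    penalty = c * sum(t * t for t in theta)
    return value_term + policy_term + penalty


# Toy position with two legal moves and a network that already matches the search.
loss = alphago_zero_loss(z=1.0, v=1.0, pi=[0.5, 0.5], p=[0.5, 0.5], theta=[0.0])
```

In this toy case the value and penalty terms vanish, so the loss reduces to the entropy of π, namely log 2: the cross-entropy term is minimised, not zeroed, when p equals π.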

Incidentally a two-page paper appeared on arXiv today with the title Demystifying AlphaGo Zero, by Dong, Wu, and Zhou. Which sets AlphaGo Zero as a special generative adversarial network. And invoking Wasserstein distance as solving the convergence of the network. To conclude that “it’s not [sic] surprising that AlphaGo Zero show [sic] a good convergence property”… A most perplexing inclusion in arXiv, I would say.

amber warning

Posted in Statistics on December 10, 2017 by xi'an

Just saw this severe Met warning of snow over Warwickshire and neighbouring counties… The campus is indeed covered with snow, but not that heavily. Yet. (It is comparatively mild in Austin, Texas, even though the icy wind turned my fingers to icicles during my morning run there!)

Amber warning of snow

From: 0810 on Sun 10 December
To: 1800 on Sun 10 December
Updated 6 hours ago Active

A spell of heavy snow is likely over parts of Wales, the Midlands and parts of Northern and Eastern England on Sunday.

Road, rail and air travel delays are likely, as well as stranding of vehicles and public transport cancellations. There is a good chance that some rural communities could become cut off.

This is an update to extend the warning area as far south as Gloucestershire, Wiltshire, Oxfordshire, Buckinghamshire, Hertfordshire and Essex.

foundations of probability

Posted in Books, Statistics on December 1, 2017 by xi'an

Following my reading of a note by Gunnar Taraldsen and co-authors on improper priors, I checked out the 1970 book of Rényi from the Library at Warwick. (First time I visited this library, where I got very efficient help in finding and borrowing this book!)

“…estimates of probability of an event made by different persons may be different and each such estimate is to a certain extent subjective.” (p.33)

The main argument from Rényi used by the above-mentioned note (and an earlier paper in The American Statistician) is that “every probability is in reality a conditional probability” (p.34). Which may be a pleonasm as everything depends on the settings in which it is applied. And as such not particularly new since conditioning is also present in e.g. Jeffreys’ book. In this approach, the definition of the conditional probability is traditional, if restricted to conditioning on a subset of elements from the σ-algebra. The interesting part in the book is rather that a measure on this subset can be derived from the conditionals. And extended to the whole σ-algebra. And is unique up to a multiplicative constant. Interesting because this indeed produces a rigorous way of handling improper priors.

“Let the random point (ξ,η) be uniformly distributed over the whole (x,y) plane.” (p.83)

Rényi also defines random variables ξ on conditional probability spaces, with conditional densities. With constraints on ξ for those to exist. I have more difficulty digesting this notion as I do not see the meaning of the above quote or of the quantity

P(a<ξ<b|c<ξ<d)

when P(a<ξ<b) is not defined. As for instance I see no way of generating such a ξ in this case. (Of course, it is always possible to bring in a new definition of random variables that only agrees with regular ones for finite measure.)
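To fix ideas on what such a conditional amounts to formally, here is a minimal worked instance, assuming (my choice for illustration, not a formula quoted from the book) that the underlying unbounded measure is Lebesgue measure λ on the real line and that c ≤ a < b ≤ d are finite:

```latex
P(a<\xi<b \mid c<\xi<d)
  \;=\; \frac{\lambda\big((a,b)\cap(c,d)\big)}{\lambda\big((c,d)\big)}
  \;=\; \frac{b-a}{d-c}
```

The ratio only involves sets of finite measure and is unchanged when λ is multiplied by a constant, whereas an unconditional P(a<ξ<b) would require dividing by λ(ℝ)=∞. This shows how the conditional is defined, not how one would simulate such a ξ.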

A of A

Posted in Books, Kids, Statistics, Travel, University life on November 30, 2017 by xi'an

Next June, at the same time as the ISBA meeting in Edinburgh, which is slowly taking shape, there will be an Analysis of Algorithms (AofA) meeting in Uppsala (Sweden) with Luc Devroye as the plenary Flajolet Award speaker. The full name of the conference is the 29th International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms. While it is unfortunate the two conferences take place at the same time (and not in the same location), this also provides a continuity of conferences, with MCqMC in Rennes the following week and the summer school in simulation in Warwick the week after (with Art Owen as the LMS Lecturer).

About our summer school, I want to point out that, thanks to several sponsors, we will be able to provide a substantial number of bursaries for junior researchers. This should be an additional incentive for attendees of the previous week’s Young Bayesian meeting (BAYSM) to remain the extra days near Warwick and seize this fantastic opportunity. Other instructors are Nicolas Chopin, Mark Huber and Jeff Rosenthal!

importance demarginalising

Posted in Books, Kids, pictures, Running, Statistics, Travel, University life on November 27, 2017 by xi'an

A question on X validated gave me minor thought fodder for my crisp pre-dawn run in Warwick the other week: if one wants to use importance sampling for a variable Y that has no closed form density, but can be expressed as the transform (marginal) of a vector of variables with closed form densities, then, for Monte Carlo approximations, the problem can be reformulated as the computation of an integral of a transform of the vector itself and the importance ratio is given by the ratio of the true density of the vector over the density of the simulated vector. No Jacobian involved.
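As a sanity check of this reformulation, here is a minimal sketch with only the standard library. The toy setting is entirely my own assumption and not the one from the X validated question: Y = X₁X₂ with X₁, X₂ independent standard Normals, simulated from an inflated N(0, 2²) proposal, and the function names are hypothetical. The weight is the ratio of the two joint densities of the vector, and the density of Y is never needed.

```python
import math
import random


def normal_pdf(x, scale):
    """Density of a centred Normal with standard deviation `scale`."""
    return math.exp(-0.5 * (x / scale) ** 2) / (scale * math.sqrt(2.0 * math.pi))


def demarginalised_is(g, transform, n, proposal_scale=2.0, seed=0):
    """Importance sampling estimate of E[g(Y)] with Y = transform(X1, X2).

    Target: X1, X2 iid N(0, 1). Proposal: X1, X2 iid N(0, proposal_scale^2).
    The weight is the ratio of joint densities of (X1, X2); no Jacobian
    and no density of Y is involved.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x1 = rng.gauss(0.0, proposal_scale)
        x2 = rng.gauss(0.0, proposal_scale)
        target = normal_pdf(x1, 1.0) * normal_pdf(x2, 1.0)
        proposal = normal_pdf(x1, proposal_scale) * normal_pdf(x2, proposal_scale)
        total += (target / proposal) * g(transform(x1, x2))
    return total / n


# E[Y^2] = E[X1^2] E[X2^2] = 1 for the product of two standard Normals.
estimate = demarginalised_is(lambda y: y * y, lambda a, b: a * b, n=100_000)
```

The proposal has heavier tails than the target, so the weights are bounded (by 4 here) and the estimator has finite variance. The estimate should land close to the exact value 1, up to Monte Carlo error.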

four positions at Warwick Statistics, apply!

Posted in Statistics on November 21, 2017 by xi'an

Enthusiastic and excellent academics are sought to be part of our Department of Statistics at Warwick, one of the world’s most prominent and most research active departments of Statistics. We are advertising four posts in total, which reflects the strong commitment of the University of Warwick to invest in Statistics. We intend to fill the following positions:

  • Assistant or Associate Professor of Statistics (two positions)

  • Reader of Statistics

  • Full Professor of Statistics.

All posts are permanent, with posts at the Assistant level subject to probation.

You will have expertise in statistics (to be interpreted in the widest sense and to include both applied and methodological statistics, probability, probabilistic operational research and mathematical finance together with interdisciplinary topics involving one or more of these areas) and you will help shape research and teaching leadership in this fast-developing discipline. Applicants for senior positions should have an excellent publication record and proven ability to secure research funding. Applicants for more junior positions should show exceptional promise to become leading academics.

While the posts are open to applicants with expertise in any field of statistics (widely interpreted as above), the Department is particularly interested in strengthening its existing group in Data Science. The Department is heavily involved in the Warwick Data Science Institute and the Alan Turing Institute, the national institute for data science, headquartered in London. If interested, a successful candidate can apply to spend part of their time at the Alan Turing Institute as a Turing Fellow.

Closing date: 3 January 2018 for the Assistant/Associate level posts and 10 January 2018 for the Full Professor position.

Informal enquiries can be addressed to Professors Mark Steel, Gareth Roberts, and David Firth or to any other senior member of the Warwick Statistics Department. Applicants at Assistant/Associate levels should ask their referees to send letters of recommendation by the closing date to the Departmental Administrator, Mrs Paula Matthews.