2012 results

Modeling and estimation framework (S. Raczynski, S. Fukayama, E. Vincent, S. Sagayama)
Our past work provided a proof of concept of the linear interpolation paradigm for the smoothing of single-feature distributions and of the log-linear interpolation paradigm for the integration of multiple features into a joint distribution. In order to further validate this approach, we compared linear vs. log-linear interpolation for the latter task and showed that log-linear interpolation was indeed superior (see preprints below).
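As a rough illustration of the two paradigms compared above, the following minimal sketch combines two toy distributions over the same symbols, first by linear interpolation (a weighted arithmetic mean) and then by log-linear interpolation (a weighted geometric mean followed by renormalization). The function names, the chord symbols and the weights are illustrative assumptions, not taken from our released software, and the sketch assumes strictly positive probabilities so that the logarithms are defined.

```python
import math


def linear_interpolation(dists, weights):
    """Weighted arithmetic mean of distributions (weights assumed to sum to 1)."""
    symbols = dists[0].keys()
    return {s: sum(w * d[s] for d, w in zip(dists, weights)) for s in symbols}


def log_linear_interpolation(dists, weights):
    """Weighted geometric mean of distributions, renormalized to sum to 1.

    Assumes every probability is strictly positive.
    """
    symbols = dists[0].keys()
    unnorm = {
        s: math.exp(sum(w * math.log(d[s]) for d, w in zip(dists, weights)))
        for s in symbols
    }
    z = sum(unnorm.values())  # normalization constant
    return {s: v / z for s, v in unnorm.items()}


# Two hypothetical models over the same three chord symbols.
p_a = {"C": 0.7, "F": 0.2, "G": 0.1}
p_b = {"C": 0.1, "F": 0.2, "G": 0.7}
weights = [0.5, 0.5]

lin = linear_interpolation([p_a, p_b], weights)
loglin = log_linear_interpolation([p_a, p_b], weights)
print(lin)
print(loglin)
```

In this toy case, linear interpolation leaves the probability of "F" unchanged at 0.2, whereas log-linear interpolation raises it, because it favors symbols on which the two models agree.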

Software and evaluation infrastructure (D. Moreau, S. Raczynski, J. Thiemann, N. Ito, S. Fukayama, E. Vincent, S. Sagayama)
Our work on temporal organization modeling and symbolic modeling led to two more pieces of software that we released under the GPL (automatic melody harmonization software, multiple pitch estimation software). This software incorporates the basic modules of smoothing, interpolation and junction tree decoding needed to address other tasks involving the integration of multiple features into a joint distribution.

In parallel, we recruited D. Moreau as an R&D engineer to investigate and develop the tools needed for the automatic collection of large-scale music corpora. This issue is crucial, as the impact of the unsupervised statistical approach which we adopted is expected to be even greater once large datasets become available in place of the small (often handcrafted) datasets available today. Given a composer name and a piece title, the tools he has developed so far identify relevant symbolic music data files (MIDI, leadsheets, lyrics, etc.) in a number of online music archives and automatically align MIDI and leadsheet files so as to confirm whether they actually correspond to the same song. Starting from the categorization of music variables that we published in 2010, he has also proposed a typology of these variables which may be used as the basis for a future all-in-one XML file format.
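The idea of using alignment to check whether two files describe the same song can be sketched as follows. This is only an illustration under strong assumptions, not the method implemented in the tools: each file is assumed to have already been reduced to a sequence of chord root pitch classes (integers from 0 to 11), the alignment uses plain dynamic time warping with a 0/1 mismatch cost, and the function names and the threshold value are hypothetical.

```python
def dtw_cost(a, b):
    """Dynamic time warping cost between two sequences (0/1 local mismatch cost)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # d[i][j] = minimal cost of aligning a[:i] with b[:j]
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = 0.0 if a[i - 1] == b[j - 1] else 1.0
            d[i][j] = local + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def same_song(a, b, threshold=0.2):
    """Hypothetical decision rule: accept when the normalized cost is small."""
    if not a or not b:
        return False
    return dtw_cost(a, b) / max(len(a), len(b)) <= threshold


# Toy data: chord roots per beat from a MIDI file, per bar from a leadsheet.
midi_roots = [0, 0, 5, 5, 7, 7, 0, 0]
leadsheet_roots = [0, 5, 7, 0]
other_roots = [2, 2, 9, 9, 4, 4, 11, 11]

print(same_song(midi_roots, leadsheet_roots))
print(same_song(other_roots, leadsheet_roots))
```

Dynamic time warping is used here because the two sequences may describe the same progression at different time resolutions; any real system would also need to handle transposition and extraction errors, which this sketch ignores.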

Finally, within the scope of N. Ito's work on diffuse noise modeling, we recorded a database of real-world noise environments and released it under a Creative Commons license (corpus). This database is the only one that features both a large number of microphones (16) and a large number of environments (15).

Symbolic modeling (S. Raczynski, S. Fukayama, E. Vincent, S. Sagayama)
We finalized our past work on the log-linear interpolation paradigm and submitted two journal papers about its applications: one about the joint modeling of "horizontal" and "vertical" dependencies between notes applied to polyphonic pitch estimation (preprint) and another about the joint modeling of melody, key and chords applied to melody harmonization (preprint). We recently started working on genre-dependent models of melody and chord sequences. Contrary to speech, where languages are well defined and distinct from each other, music genres span a continuum of styles. We aim to model this infinite number of possible genres via suitably defined "topic models".

Acoustic modeling and segregation (N. Ito, E. Vincent, N. Ono)
N. Ito defended his PhD thesis at the University of Tokyo in January (thesis). We summarized our general statistical framework for the modeling of diffuse noise and the associated denoising algorithms into a journal paper which will shortly be submitted (preprint).

Events and funding (S. Raczynski, G. Sargent, N. Ono, E. Vincent, H. Kameoka, F. Bimbot, S. Sagayama)
We organized a final one-day workshop in Tokyo, which was co-sponsored by the Japan Chapter of the IEEE Signal Processing Society. The workshop involved 8 presentations by members of the project and 6 presentations by external attendees. Besides the presenters and other students from the University of Tokyo, it attracted 7 more external attendees, mostly from Japanese companies.

Following his successful PhD defense, N. Ito was immediately recruited as a researcher by NTT CS Labs.

N. Ono submitted a project proposal in response to the NII Grand Challenge call, and the proposal was accepted. The project, which will run from April 2012 to March 2013, involves NII, Tokyo Institute of Technology, the University of Tsukuba and E. Vincent as an international collaborator. The budget available to METISS is 500,000 yen for travel and to recruit an intern. The goal of the project is to conduct baseline studies on the topic of ad-hoc microphone arrays, so as to identify the most promising research directions with a view to jointly submitting a larger-scale ANR+JST project when the ANR+JST program re-opens in the future.

Following the submission of our jointly developed structural music segmentation software to the MIREX 2011 evaluation campaign, the American company SmartSound Software Inc. contacted us in order to test and possibly acquire a license for this software. A non-disclosure agreement for testing purposes is currently being signed between the company and Inria (together with CNRS and Université de Rennes 1).

VERSAMUS
