1 IST Proposal MobiNews Meeting - June 10th, 2003
“Automatic and Personalised Compilation of Broadcast News with Audio Playback on Mobile Devices”
François CAPMAN, PhD, Research Engineer, Technologies Radio & Signal Unit
Tel : +33 (0) Fax : +33 (0)

2 MobiNews Workshop Agenda
10h h15 Agenda, objectives of the meeting
10h h30 Presentation of MobiNews IST proposal, current status
10h h30 Presentation of each organisation 1 (5mn/10mn)
11h h45 Break
11h h15 Presentation of each organisation 2 (5mn/10mn)
12h h45 Definition of contributions and overall structure of the project
12h h45 Lunch
13h h15 Detailed structure of the project, description of work-packages
15h h45 Other topics (additional partners, ...)
15h h00 Further steps, planning for the proposal
16h h30 Discussion - Conclusion

3 IST Objectives (2nd Call)
Call 2: publication 17/6 2003, closing 15/ – would have an indicative budget of around 525 MEuros (80 % pre-distributed).
Objectives covered in Call 2:
– Advanced displays
– Optical, opto-electronic, & photonic functional components
– Open development platforms for software and services
– Cognitive systems
– Embedded systems
– Applications and services for the mobile user and worker (60 MEuros)
– Cross-media content for leisure and entertainment (55 MEuros)
– GRID-based Systems for solving complex problems
– Improving Risk management
– eInclusion
Specific Targeted Research Project (STREP): 2.5 / 3.0 MEuros (Funding)

4 IST Objectives (2nd Call)
Cross-media content for leisure and entertainment
Objective: To improve the full digital content chain, covering creation, acquisition, management and production, through effective multimedia technologies enabling multi-channel, cross-platform access to media, entertainment and leisure content in the form of film, music, games, news and alike. It will accelerate take up in B2B, B2C and C2C, currently hampered by insufficient productivity, convergence and high cost.
Focus is on:
– Developing technologies supporting the creation of new, compelling forms of content for interactive, creative or artistic consumption. Research should aim at advancing imaging technologies and audio-visual representation, multi-dimensional immersive environments and experience portals, as well as virtual, augmented and mixed reality technologies featuring higher levels of quality and accuracy. Device adaptivity and contextualisation, personalisation and (emotive) feedback, and ability to capture real-time, multimodal and multisensorial input will be embedded as needed.
– Developing integrated content programming environments allowing to retrieve content from different sources, types and locations, and to store, compress and categorise it, with a view to realising programming appropriate to a particular audience and delivery channel, including interactive TV, e-cinema, radio, online games and music.

5 IST Objectives (2nd Call)
Applications and Services for the Mobile User and worker
Objective: To foster the emergence of rich landscape of innovative applications and services for the mobile user and worker and to support the use and development of new work methods and collaborative work environments. These should be based on interoperable mobile, wireless technologies and the convergence of fixed and mobile communication infrastructures. Such applications and services will enable new business models, new ways of working, improved customer relations and government services in any context. The target applications and services will be capable of being seamlessly accessed and provided anywhere, anytime and in any context.
Focus is on:
– The integration of technologies into a wide range of innovative mobile and multimodal applications and services including workplace designs that enhance creativity and productivity. (Intelligent, adaptive and self-configuring services that deploy wearable interfaces and enable automatic context-sensitivity, user profiling and personalisation in a trusted and secure environment as well as multi-lingual and multi-cultural presentation, and multiple modes of interaction)
– Addressing the major hurdles for the deployment of applications and services for the mobile user.

6 MobiNews Proposal
Targeted Application
– Automatic compilation of broadcast news (audio, text) with audio playback on mobile devices (2.5G, 3G).
– Access to personally selected text and audio news from a service/source provider using the Multimedia Messaging Service (MMS) transmission protocol.
Expected Features
– Fast and reliable access to a synthetic newscast on a regular basis (daily, weekly, …) or upon request.
– Access to various identified sources within the same compilation, using scheduled programme.
– Automatic server-based generation of the synthetic newscast, with MMS WAP 2.0
– Low-cost transmission towards mobile devices.
– User-defined profile for automatic download
– Enhanced Man Machine Interface (MMI) for query submission, keyword-based search, ...

7 MobiNews Proposal Technical Objectives
Audio data and Text data Structuring:
– automatic / semiautomatic segmentation (speaker tracking, scheduled programme, …)
– classification, discrimination (speech, music, jingles, …)
– transcription and information retrieval (word-spotting, key-words, …)
– automatic summarisation
Very Low Bit Rate (VLBR) Wide-Band speech compression (with optional scalable audio stage).
Text-To-Speech (TTS) synthesis for audio display of the transmitted text component (optional voice conversion, style / prosody mimicking).
Software optimisation (complexity and memory) of VLBR decoder and TTS modules for embedded solutions on mobile devices (downloadable as plug-ins).
Enhanced interface for mobile products (Natural Language Processing (NLP), …)
Demonstrator with MMS link between a PC-based server and a handheld mobile terminal.

8 MobiNews Proposal

9 MobiNews Proposal

10 VLBR compression for MobiNews
Targeted duration: 10 to 15 minutes in one single MMS
=> VLBR between 800 and 1200 bits/sec
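
The figures above can be checked with a rough size calculation. This is a minimal sketch, assuming a constant bit rate and counting only the compressed audio: MMS headers, the text component and any packaging overhead are ignored. `payload_bytes` is a hypothetical helper name, not something defined by the proposal.

```python
def payload_bytes(duration_min: float, bit_rate_bps: int) -> int:
    """Size in bytes of an audio stream of the given duration at a constant bit rate."""
    total_bits = duration_min * 60 * bit_rate_bps
    return int(total_bits // 8)


# Corner cases of the targeted range: 10 to 15 minutes, 800 to 1200 bit/s.
for minutes in (10, 15):
    for rate in (800, 1200):
        size = payload_bytes(minutes, rate)
        print(f"{minutes} min at {rate} bit/s -> {size} bytes")
```

Under these assumptions the audio alone ranges from 60 000 bytes (10 minutes at 800 bit/s) to 135 000 bytes (15 minutes at 1200 bit/s).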

11 MobiNews Work Packages
Definition of Work Packages
WP 1 Project management
WP 2 Analysis of the needs, analysis of the market, dissemination
WP 3 Broadcast radio news databases (specifications, collect, recordings)
WP 4 Audio and text data structuring
WP 5 Very-Low Bit Rate (VLBR) compression for synthetic newscast
WP 6 Text-To-Speech (TTS) synthesis for mobile devices
WP 7 MMS-based demonstrator (Server and mobile applications, MMI, …)
WP 8 Evaluation methodology, field trials, analysis

12 MobiNews Consortium
Thales Communications (France)
L.I.A. (France)
E.N.S.T. (France)
E.S.I.E.E. (France)
Elan Speech (France)
Brno University of Technology (Czech Republic)
Multitel (Belgium)
INESC-ID (Portugal)
PT Inovação, Voice services and platforms Dept (Portugal)
Radio France Multimedia (France)
Belga Press Agency (Belgium)
Portuguese Radio/TV (Portugal)
???

13 Presentation of organisations
General Presentation and Potential Contributions to MobiNews
1 - Gwenaël Guilmin (Thales Communications)
2 - Bertrand Ravera : RNRT project proposal Mobi-Info
2 - Corinne Fredouille (L.I.A.)
3 - Maurice Charbit (E.N.S.T.)
4 - Geneviève Baudoin (E.S.I.E.E.)
5 - Jacques Toën (ELAN SPEECH)
6 - Petr Motlicek (BRNO University of Technology)
7 - Stéphane Deketelaere (MULTITEL)
8 - Isabel Trancoso (INESC-ID)
9 - Nuno Beires (PT INOVACAO)
10 - Caroline Roy (RADIO France MULTIMEDIA)

14 Contributions
Thales Communications:
– Speech segmentation / classification
– Very-Low Bit Rate speech compression using parametric approaches
– optimisation of VLBR for a mobile plug-in
E.N.S.T.:
– voice conversion using improved HNM synthesis, joint-optimisation of speech units for coding and synthesis
E.S.I.E.E.:
– Very Low Bit Rate speech compression using recognition/synthesis
– Very Low Bit Rate speech compression using parametric approaches
– voice conversion
BRNO University of Technology:

15 Contributions
ELAN SPEECH:
– distributed architecture (mobile/server) for speech synthesis
– optimisation for a mobile plug-in
– voice personalization, voice conversion
INESC-ID, and L.I.A.:
– audio data structuring
MULTITEL:
– Man-Machine Interface, Natural Language Processing
PT INOVACAO:
– MMS synthetic newscast packaging
– MMS-based demonstrator
Radio France Multimedia, and Belga Press Agency (+ Portuguese TV/rad):
– specifications
– news content provider
– evaluation

16 MobiNews: WORKPLAN
WP 2: Analysis of the market, … needs, dissemination:
WP2.1: Analysis of the market: existing services
WP2.2: Analysis of the needs: limitations of the existing services
WP2.3: Dissemination: valorisation of the outcome of the project, standardisation, ...

17 MobiNews: WORKPLAN
WP 3: Broadcast radio news databases
WP3.1: Audio databases (collect, recordings, annotation, meta-data, …)
WP3.2: Text databases (collect, annotation, meta-data, …)
WP3.3: Service specifications (features, user acceptance, …)

18 MobiNews: WORKPLAN
WP 4: Audio and Text data Structuring
WP4.1: Low-level segmentation
– speech/non speech discrimination (silence, noise, pause, speech, music, jingle, …)
– speaker characterisation (identification, tracking, segmentation, clustering, …)
WP4.2: High-level segmentation
– speech-to-text transcription
– story segmentation, topic detection, tracking and classification
WP4.3: Customisation
– text summarisation, audio summarisation
– constrained summarisation (profile-driven, queries-driven, duration, multi-sources, …)
– meta-data information
– evaluation methodology (reference human-built summaries, quiz scores, …)
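
As an illustration of what a profile-driven, duration-constrained compilation could look like, here is a minimal sketch. It is not the project's method: the `Story` record, the keyword-overlap score and the greedy selection are all assumptions made for the example, and the story data is invented.

```python
from dataclasses import dataclass


@dataclass
class Story:
    title: str
    keywords: set
    duration_s: float


def select_stories(stories, profile_keywords, max_duration_s):
    """Greedily pick stories matching the user profile within a duration budget."""
    # Score each story by the number of profile keywords it matches.
    scored = [(len(s.keywords & profile_keywords), s) for s in stories]
    # Drop stories that match nothing in the profile.
    scored = [(score, s) for score, s in scored if score > 0]
    # Best match first; among equal scores, shorter stories first.
    scored.sort(key=lambda pair: (-pair[0], pair[1].duration_s))
    selected, total = [], 0.0
    for _, story in scored:
        if total + story.duration_s <= max_duration_s:
            selected.append(story)
            total += story.duration_s
    return selected


# Hypothetical segmented stories and user profile.
stories = [
    Story("Football results", {"sport", "football"}, 240),
    Story("Market update", {"economy", "markets"}, 300),
    Story("Weather", {"weather"}, 120),
    Story("Olympic bid", {"sport", "economy"}, 400),
]
chosen = select_stories(stories, {"sport", "economy"}, max_duration_s=900)
print([s.title for s in chosen])
```

With a 15-minute budget (900 s), the sketch keeps "Olympic bid" and "Football results" and skips "Market update", which would exceed the budget. A real system would score on transcripts and topic labels rather than hand-assigned keywords.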

19 MobiNews: WORKPLAN
WP 5: VLBR Speech / Audio compression
WP5.1: Segmental-based parametric compression of synthetic newscast
– audio stream analysis and segmentation
– optimised compression of structured messages
– scalable solutions (bit-rate and bandwidth)
WP5.2: Compression based on natural speech units indexing
– optimised HNM-based speech synthesis
– speaker-independent mode (speaker adaptation, voice conversion)
– joint-optimisation of units for both synthesis and coding
– compression of synthesis units for memory storage optimisation

20 MobiNews: WORKPLAN
WP 6: Text-To-Speech synthesis for mobile devices
WP6.1: Voice conversion / customisation
WP6.2: Optimisation for mobile terminals
– complexity reduction
– memory storage
– distributed software architecture

21 MobiNews: WORKPLAN
WP 7: User-centred design of the MMI (Man Machine Interface)
WP7.1: Server-based application
– optimised entries for the definition of user profile, user queries, ...
WP7.2: Mobile embedded application
– design of an efficient mobile interface with emphasis on the ease-of-use and the acceptability (= usability)

22 MobiNews: WORKPLAN
WP 8: MMS-based demonstrator
WP8.1: Server-based applications
– module for data structuring
– module for audio compression
– MMS packaging
WP8.2: Mobile devices embedded applications
– MMS de-packaging
– optimised plug-in for text-to-speech synthesis
– optimised plug-in for audio decompression
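
The packaging and de-packaging steps can be sketched with a plain MIME multipart container. This only illustrates the two-part structure (text news plus compressed audio). It does not produce a real over-the-air MMS, which uses its own binary encapsulation. The function names, the `newscast.vlbr` file name and the `application/octet-stream` type for the VLBR stream are assumptions made for the example.

```python
import email
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def package_newscast(text_news: str, vlbr_audio: bytes) -> MIMEMultipart:
    """Server side: bundle the text component and the compressed audio."""
    message = MIMEMultipart("mixed")
    message.attach(MIMEText(text_news, "plain", "utf-8"))
    audio_part = MIMEApplication(vlbr_audio, "octet-stream")
    audio_part.add_header(
        "Content-Disposition", "attachment", filename="newscast.vlbr"
    )
    message.attach(audio_part)
    return message


def unpack_newscast(message):
    """Mobile side: recover the text (for TTS) and the audio (for the decoder)."""
    text_news, vlbr_audio = None, None
    for part in message.walk():
        content_type = part.get_content_type()
        if content_type == "text/plain":
            text_news = part.get_payload(decode=True).decode("utf-8")
        elif content_type == "application/octet-stream":
            vlbr_audio = part.get_payload(decode=True)
    return text_news, vlbr_audio


# Round trip through the serialised form, as if sent over a link.
wire = package_newscast("Headlines of the day", b"\x01\x02\x03").as_bytes()
text, audio = unpack_newscast(email.message_from_bytes(wire))
print(text, len(audio))
```

On the terminal, the recovered text would go to the text-to-speech plug-in and the recovered byte stream to the audio decompression plug-in.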

23 MobiNews: WORKPLAN
WP 9: Evaluation methodology, Field trials, Analysis
WP9.1: Evaluation methodologies
– audio quality for speech synthesis and compression
– evaluation of synthetic newscast (summarisation)
– evaluation of MMI (queries, profile, …)
WP9.2: Field trials and analysis
– quiz score methods
– … ?

24 Administrative Issues
The project proposal will include:
– A1 form: proposal acronym, proposal number, proposal title, estimated duration (30 months ?), key word codes, abstract (co-ordinator)
– A2 form: participant submission form (for each participant)
– A3 form: financial information (co-ordinator)
– B part: non-anonymous description of scientific/technological objectives

