@alejandro | Good evening everybody. |
@alejandro | Now it's time for Michael Meeks's presentation about OpenOffice. |
@alejandro | Michael Meeks works at Novell as an in-house developer and has contributed to GNOME development in recent years with Bonobo and ORBit. |
@alejandro | Now his work is focused mainly on the OpenOffice suite. |
@alejandro | Translations to Spanish will be in #redes and questions or comments in #qc |
@alejandro | Welcome Michael, thanks for joining us. |
@michael | thank you alejandro, |
@michael | so - today I'd like to talk about OpenOffice.org (OO.o for short) - from 3 angles: |
@michael | + An overview |
@michael | + Current development |
@michael | + Future stuff |
@michael | . |
@michael | please do interrupt with questions in #qc, and forgive my (potentially) irritating IRC mannerisms such as blank lines with just a period. |
@michael | . |
@michael | So: OO.o an Overview: |
@michael | OO.o is a big source base, estimates vary but I'm sticking to 8 million lines |
@michael | [ incidentally my spelling is terrible, but no doubt that'll get fixed in translation ;-] |
@michael | people often wonder why it's so 'big' |
@michael | it feels big (slow to start), and it's 'different', implementing lots of technologies again - the toolkit: VCL, a VFS (UCB) etc. etc. |
@michael | I'd like to persuade you that many of these things have good pragmatic reasons: eg. historical - it's a 25+year old project |
@michael | or eg. wrt. complexity - there are many constraints - backwards compatibility eg. or interoperability - laying out a document in the same way that it is viewed in MS Word eg. |
@michael | and then of course - there is the sheer depth of features accumulated over many years & maintained. |
@michael | . |
@michael | so - yes, OO.o is an apparently large beast - but it is nicely componentized, |
@michael | this has some benefits & some demerits as we'll see when we look at performance. |
@michael | but the advantage is that ideally you shouldn't need to load much more code than you're going to need. |
@michael | thus OO.o should be fast - but more of this later. |
@michael | . |
@michael | One of the most interesting things (to me & Novell) about OO.o is that it is the most capable Office suite in the Free software space. |
@michael | It gives MS Office a good run for its money - indeed, to my mind there are really only 2 interesting office suites: MS Office and OO.o. |
@michael | It is also quite interesting (although I would not propose that people use a non-Free O/S) that we can compete with MS Office (MSO) on Win32. |
@michael | Indeed - interestingly OO.o performs rather better on Win32 than Linux for several reasons |
@michael | . |
@michael | So why is OO.o strategically important ? and why do I think you can make a big difference on this project ? ;-) |
@michael | + MS earn 30% of their turnover from 'Office' [source The Economist] |
@michael | + the existing community is small - it's easy to make an impact by getting involved. |
@michael | + we have several millions of users: many on Win32. |
@michael | . |
@michael | this last point - the user base is critical: 'everyone' knows that Free software is wonderful right ? |
@michael | but in fact - that knowledge is really quite isolated, it may be that many graduate-level people know about software Freedom & how important it is - but (at least in the west) very few 'normal' people do :-) |
@michael | OO.o in this case - is a spearhead for Free software, yes on the Win32 desktop - but genuinely useful, easy to install Free software, |
@michael | and - so - sitting on the train, at the coffee shop, meeting random non-technical people - they have heard of OO.o, though GNU/Linux may be unfamiliar to them. |
@michael | . |
@michael | hence - by improving OO.o by 0.1% - the net effect over so many millions of people is vast, |
@michael | particularly wrt. the image & reputation of Free software in the world at large. |
@michael | . |
@michael | so; |
@michael | perhaps enough about that |
@michael | developers should poke at the developer wiki: http://wiki.services.openoffice.org/wiki/Main_Page |
@michael | I think you get my passion for Free software & OO.o in particular :-) |
@michael | . |
@michael | So; |
@michael | [ no questions so far ... ] |
@michael | There are a number of interesting new things in OO.o 2.0 |
@michael | many of them are already known since they were back-ported to the OO.o 1.1.x releases & are thus assumed to 'be there' - whereas in reality they just got released up-stream. |
@michael | Other interesting new things: |
@michael | Improved ergonomics - (still plenty of room for further improvement here though) |
@michael | + there is a completely new UI for impress - which was much needed, it's far easier to create slides & use the full set of transitions |
@michael | + lots of other fixes: new toolbar docking, also the much requested 'format copy brush' feature. |
@michael | . |
@michael | if you're a developer like me - you prolly find you almost never use OO.o yourself - so perhaps these are unfamiliar. |
@michael | . |
@michael | another thing is the 'base' component - this moves a load of (existing) database functionality into 1 place, |
@michael | and provides a nice unified shell for it, along with a 'database file' familiar to those who have used 'Access' |
@michael | other things you might not notice - under the hood a big chunk of old cruft got removed, |
@michael | the StarOffice 2.0->5.0 filters were ripped out of the core code & pushed into a 'binfilters' plugin, and are now no longer loaded / used unless you require them |
@michael | other bits: scattered interoperability improvements, native installers (RPM, MSI etc.), XForms support, and of course the OpenDocument format. |
@michael | . |
@michael | wrt. how the project works: all OO.o code has a unified copyright holder (which is Sun Microsystems) |
@michael | they achieve this by a 'Joint Copyright Assignment' whereby (IANAL) you continue to own the copyright - but so does Sun |
@michael | from #qc: |
@michael | <amd> try =GAME('StarWars') in calc2 |
@michael | right :-) or try StarWriterTeam<F3> in writer, |
@michael | there are some fun people to work with there :-) |
@michael | so - copyright assignment aside - the code is LGPL - this of course allows proprietary binary plugins to be written vs. the public API - but also other GPL incompatible licenses to be used, |
@michael | Sun contributes most heavily to development - perhaps 80%+ of the programmer resources - though over time - as more companies & individuals get involved: Novell, Red Hat, Intel, Google etc. |
@michael | this balance is changing towards a more healthy balance of contribution. |
@michael | Sun still have many processes that are over-formal and unfamiliar to new developers - so, it's worth grabbing people on IRC. #go-oo,#OpenOffice.org irc.freenode.net when you get problems |
@michael | . |
@michael | (developers only for those channels please). |
@michael | . |
@michael | so - onto 'Current developments' |
@michael | . |
@michael | One of the best things that has happened in OO.o recently is not software related, |
@michael | it is the switch to a time-based release schedule; |
@michael | this innovation is of course familiar to followers of the Linux Kernel, GNOME etc. |
@michael | however - OO.o has previously worked on an 18 month release cycle, which tends to kill momentum around the project, and ensure that your new/cool feature is buried for 1/2 your student lifetime before seeing the users. |
@michael | so - switching to a 3/6 month time-based release schedule is a major revolution & improvement, |
@michael | so - I'm excited that more regular OO.o feature drops are coming, helping vendors, users and developers alike. |
@michael | . |
@michael | Another interesting new thing is the 'OpenDocument' format, |
@michael | this is of course a good thing in some sense; however - there is a real danger for Free software here; |
@michael | the danger is this: We developers produce code: preferably lots of it, so we understand & value software freedom, |
@michael | Politicians on the other hand, do not produce code; |
@michael | they -do- however produce document data. |
@michael | . |
@michael | so - they can understand & communicate this "I don't own my own data", "why do I need a patent license to read my own thesis" type arguments. |
@michael | so - this is all well and good, the problem is - that OpenStandards are not OpenSource [sorry but the alliteration got me ;-] |
@michael | so - for people overly-excited about OpenDocument, and/or any particular sales opening at the moment - I would calm down - take some pain-killers and work harder at explaining why the Free software is more important than the standard in the long run. |
@michael | . |
@michael | anyhow; |
@michael | all this is the end-user blurb. |
@michael | . |
@michael | now onto a topic close to my heart - performance. |
@michael | Why is OO.o so slow to startup ? |
@michael | or put another way by a friend: "gnumeric starts up fully before OO.o renders the splash-screen" :-) |
@michael | . |
@michael | so - there are really 3 parts to performance: |
@michael | + warm start, + cold start & + document load. |
@michael | . |
@michael | + warm start: |
@michael | So - 1/3rd to 1/2 (depending on how fast your machine is [ faster == 1/2 ]) of the OO.o warm start time is linking. |
@michael | If you want to understand why linking is so slow read: |
@michael | Ulrich Drepper's paper: http://people.redhat.com/drepper/dsohowto.pdf |
@michael | the root problem here is one of design - ELF specifies a feature called 'interposing', |
@michael | this feature is re-used to implement the 'LD_PRELOAD' functionality loved by some, |
@michael | however it is much more insidious; eg. consider C++ [ and much of the problem stems from C++'s language design sadly] |
@michael | eg. if you 'throw' an exception - you can never be sure that it will be caught, |
@michael | perhaps you are the only user of this exception in the whole world, |
@michael | => you have to output type information to describe the exception. |
@michael | _ZTI12FooException... |
@michael | of course - it's quite possible the exception is caught by some other shared library (DSO) that was unknown at compile time, |
@michael | but then - perhaps it was never thrown => the catcher must also emit this exception information, |
@michael | fine so far ? |
@michael | so - the problem comes when the exception is caught - to see if we caught a 'FooException' not a 'BaaException' we compare type information by pointer value, |
@michael | ergo - while we have 2 _ZTI12FooException symbols - it's necessary for them both to resolve to the same value. |
@michael | this is achieved by interposing, |
@michael | essentially - the first symbol in the search list 'wins' - and all other references to that are hidden by it. |
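The "first symbol in the search list wins" rule can be sketched with a toy model. This is not a real dynamic linker: the library names and addresses are hypothetical, and each library is reduced to a plain dictionary of symbol names.

```python
# Toy model of ELF-style symbol interposition (not a real dynamic linker).
# Library names and addresses are hypothetical, chosen for illustration.

def resolve(symbol, search_list):
    """Return (library, address) of the first definition in search order."""
    for lib_name, symbols in search_list:
        if symbol in symbols:
            return lib_name, symbols[symbol]
    raise LookupError(f"undefined symbol: {symbol}")

# Both the thrower and the catcher emit type info for FooException,
# each at its own address inside its own library.
libthrower = ("libthrower.so", {"_ZTI12FooException": 0x1000})
libcatcher = ("libcatcher.so", {"_ZTI12FooException": 0x2000})

search_list = [libthrower, libcatcher]

# Every reference is resolved through the same global search list,
# so thrower and catcher end up comparing the same pointer value.
thrower_view = resolve("_ZTI12FooException", search_list)
catcher_view = resolve("_ZTI12FooException", search_list)

print(thrower_view == catcher_view)  # True
print(thrower_view[0])               # libthrower.so
```

The copy in libcatcher.so is never used while libthrower.so comes first in the search order. Reversing the list would make libcatcher.so's copy the authoritative one instead.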
@michael | . |
@michael | so - for 1 exception this is fine - you just search the symbol tables of all libraries to find the authoritative version & use that => no problem. |
@michael | . |
@michael | the problem is that there are many, many instances of this - many tens of thousands of symbols, |
@michael | all of which have to be looked up very slowly indeed - linking takes eg. 2.5 seconds of raw CPU time (of 5) on my 2.6GHz 512k cache desktop, |
@michael | this is mainly because of cache effects - searching the 150 DSOs that make up OO.o for a given symbol hammers your L2 cache for virtually every library & every symbol. |
@michael | . |
@michael | so - |
@michael | worse 'prelink' cannot be used - since as I mentioned, OO.o is nicely componentised into shared libraries, dynamically loaded as needed, & prelink doesn't handle dlopening libraries. |
@michael | . |
@michael | so - I've been working on a feature called '-Bdirect' that implements a far more efficient linking algorithm for the majority of non 'vague' (eg. exception type information) symbols, |
@michael | this saves 75% of the linking time - giving a very substantial speedup - with more to come, |
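A rough way to see where the saving comes from is to count library probes. The model below is a sketch under loud assumptions: 30,000 symbols (the talk only says "many tens of thousands"), providers spread evenly over the 150 libraries, and direct binding modelled as "the providing library was recorded at link time, so only that one is probed". The real -Bdirect patches are not modelled, and 'vague' symbols would still need the full search.

```python
# Toy probe count: global search vs. direct binding (hypothetical model).

N_LIBS = 150        # from the talk: roughly 150 DSOs
N_SYMBOLS = 30_000  # assumption: "many tens of thousands" of symbols

# Assumption: symbol i is provided by library i % N_LIBS (an even spread).
providers = [i % N_LIBS for i in range(N_SYMBOLS)]

def probes_global(providers):
    """Search libraries in order until the provider (by index) is reached."""
    return sum(index + 1 for index in providers)

def probes_direct(providers):
    """One probe per symbol: the provider was recorded at link time."""
    return len(providers)

print(probes_global(providers))  # 2265000
print(probes_direct(providers))  # 30000

# Using the figures quoted in the talk: 5 s warm start, 2.5 s of it linking,
# and a 75% saving in linking time.
warm_start = 5.0
linking = 2.5
saving = 0.75
new_warm_start = warm_start - linking * saving
print(new_warm_start)            # 3.125
```

The probe counts only illustrate the shape of the problem. The 3.125 second figure follows directly from the numbers quoted in the talk.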
@michael | so; |
@michael | hopefully the warm startup speed problems are being tackled nicely :-) [ work is ongoing here with various other optimisations ] |
@michael | . |
@michael | <ThomasWal> concerning performance of OO.o: i dont care much about startup time as I usually use it for a much longer time than it needs to startup |
@michael | so - yes, this is a common point :-) |
@michael | why is startup time important; I guess as you point out it's mostly an aesthetic thing for developers. |
@michael | but - it's also responsiveness, |
@michael | eg. a chart in your slide-show: to render that you have to load calc - which requires slow linking & stops you flipping slides until that's done. |
@michael | . |
@michael | so. |
@michael | performance 2: cold start - the difference between warm & cold is only disk I/O |
@michael | Linux' disk I/O is not as intelligent & predictive as Win32's - it's way slower on cold start on Linux, |
@michael | on 1 laptop: 12 seconds cold start, vs ~5 seconds warm start, |
@michael | . |
@michael | clearly reducing the amount of memory required & the number of files touched helps here & work is ongoing there - with some nice wins in the can for 2.0.1 / 2.0.2. |
@michael | . |
@michael | and the 3rd part - document load performance, - again algorithmic improvements here are necessary, |
@michael | and help profiling & fixing / finding sillies here much appreciated - plenty of scope for easy wins. |
@michael | . |
@michael | so - moving on quickly; |
@michael | there are some particularly nice things being developed at the moment I'd like to share with you: |
@michael | + VBA/Calc, + Mono & Cairo integration |
@michael | [ in my team - of course, a lot more is happening across the board ] |
@michael | . |
@michael | VBA macro support in calc has been a focus of our work for some time: |
@michael | http://www.gnome.org/~michael/hypocycloid-thumb.jpeg http://www.gnome.org/~michael/hypocycloid.jpeg |
@michael | checkout the wiki http://wiki.services.openoffice.org/wiki/VBA for more details, |
@michael | . |
@michael | it turns out that while this is an almost impossible task - it's possible to get a long way without a complete solution & give a good result - with lots of macros becoming usable without a perfect solution, |
@michael | most macros it seems are written by cut/paste of the macro recorder output. |
@michael | . |
@michael | another thing - Mono integration, |
@michael | OO.o exports a large & pleasant API for programmers - what better than to expose that for use in Mono: |
@michael | http://go-oo.org/~michael/mono-uno-thumb.png http://go-oo.org/~michael/mono-uno.png |
@michael | . |
@michael | that should be shipping in more modern distros by now I hope, though it requires some polish to get Debian acceptance; those interested ping me. |
@michael | . |
@michael | another interesting thing is Cairo integration - improving the visual look of the rendering: |
@michael | http://rodo.foo.cz/blog/images/smooth-curves-co.jpg http://rodo.foo.cz/blog/images/smooth-curves-vcl-noaa-co.jpg |
@michael | . |
@michael | this really applies only to the slideshow at the current time - since rendering in the application needs to be ported to the new Canvas implementation to use cairo (which is a large job). |
@michael | cairo & Xrender themselves also require performance work to make this work well - although luckily the Canvas has pluggable implementations to allow the existing impl. to be used if things are slow for you. |
@michael | . |
@michael | Of this - I'm personally most excited by the VBA macro issue - since, many of our customers want it - and there are umpteen millions of lines of existing (simple) VBA macro code out there built into people's businesses - expense reporting forms eg. |
@michael | . |
@michael | So - quickly - some thoughts for the future: |
@alejandro | Ulrich Drepper's paper: http://people.redhat.com/drepper/dsohowto.pdf |
@michael | + the OS/X port - needs man-power, but of course for a non-Free system |
@michael | + MS' new document format - confusingly the 'Office Open XML' format vs. the 'Open Office XML' format (OpenDocument) :-) |
@michael | one of the big areas of challenge & improvement is calc |
@michael | the spreadsheet is one of the weakest parts of OO.o, it scales poorly compared with MS Office, and hinders people moving to a GNU/Linux platform on that basis |
@michael | (as our Novell / internal deployment feedback shows) |
@michael | . |
@michael | this is going to get far worse when MS Office expands the spreadsheet row/column limits: |
@michael | we go from 64k -> 1 million rows and 256 -> 16k columns |
@michael | you can see that if we're using an O(N^3) in number-of-cells algorithm already vs. MS's O(N*log(N)) and we increase the limits as above |
@michael | then we look very substantially worse; |
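The scale of that gap can be worked through numerically. The sketch below assumes "64k", "1 million" and "16k" mean the powers of two 65,536, 1,048,576 and 16,384, and it uses the illustrative orders from the talk rather than any measured behaviour of either spreadsheet.

```python
import math

# Assumption: the quoted limits are powers of two.
old_cells = 65_536 * 256        # 2**24 cells
new_cells = 1_048_576 * 16_384  # 2**34 cells

growth = new_cells // old_cells
print(growth)                   # 1024

def cost_cubic(n):
    """Illustrative O(N^3) cost in the number of cells."""
    return n ** 3

def cost_nlogn(n):
    """Illustrative O(N*log(N)) cost in the number of cells."""
    return n * math.log2(n)

cubic_factor = cost_cubic(new_cells) / cost_cubic(old_cells)
nlogn_factor = cost_nlogn(new_cells) / cost_nlogn(old_cells)

print(cubic_factor)             # 1073741824.0, i.e. 1024**3
print(round(nlogn_factor))      # about 1451, i.e. 1024 * 34 / 24
```

With 1024 times as many cells, the cubic algorithm does about a billion times more work, while the N*log(N) one does roughly 1,450 times more.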
@michael | so there is a lot of efficiency saving, and optimisation work to be done in calc - along with a big chunk of interoperability improvement. |
@michael | nothing that's too difficult by itself, ie. it should parallelise nicely - but lots of fruitful work. |
@michael | . |
@michael | another new thing coming - is a new charting engine 'chart2' that implements a far fuller set of charting primitives, |
@michael | possibly that will arrive in the 2.0.x series - hopefully so. |
@michael | similarly - an under-developed feature is 'base' - eg. importing MS Access' .mdb files by integrating with the existing fine mdbtools project is a no-brainer, |
@michael | there is currently a prototype converter - but lots of room for importing templates, macros, reports etc. to help people move to OO.o |
@michael | <amd> do you maybe happen to know Gnumeric's number-in-cells algorithm cost? |
@michael | amd: nope - the order above is an exaggeration / example :-) |
@michael | amd: but certainly gnumeric is way more efficient at loading Excel files, however it's also substantially less feature rich than calc, |
@michael | amd: the good news is that we have Jody Goldberg (gnumeric maintainer) working on Calc (as well) for me, full-time on improving Calc's interop & performance, so - I'm optimistic this gap will close fast, |
@michael | amd: of course - more help is always good :-) |
@michael | finally, |
@michael | + ergonomics & UI need improvement |
@michael | the ergonomics today are pretty shockingly bad, |
@michael | we need a metric bus-load of polish applied all-over, |
@michael | this is often a matter of arguing with the Sun UI team however who provide rather a bottleneck here. |
@michael | however - 1 revolutionary approach, using the UNO API that would bypass them is possible; |
@michael | there is a nice prototype of the use of XUL for the OO.o UI, |
@michael | this is -revolutionary- not because it embeds a chunk of Gecko into OO.o, |
@michael | or allows sensible widget layout (as any modern toolkit does) |
@michael | but because it would allow the 'logic' for a dialog to be split from the core, |
@michael | into the XUL/Javascript itself, |
@michael | (using the UNO APIs & the existing UNO/XP-COM bridge to make the settings take effect) |
@michael | so; |
@michael | essentially that would allow you to completely customize anything in the UI from emacs without a recompile, and tweak & fix the UI very much more quickly than at present: |
@michael | an exciting thought; again a project in need of acceleration by more hands. |
@michael | . |
@michael | So - finally - I guess, to get involved requires signing/mailing the JCA, I'd encourage you to do that; I did it myself as an individual, and Novell does it as a company; |
@michael | . |
@michael | so; |
@michael | I was asked to advertise a link before I finish: |
@michael | http://planet.go-oo.org/ |
@michael | is the RSS aggregator for several of the developers, |
@michael | I mentioned the IRC channel & the wiki above, |
@michael | I would avoid the (rather unhelpful) www.openoffice.org site personally |
@michael | . |
@michael | so - thanks for your patience: any questions ? :-) |
@michael | . |
@michael | then I will construct questions from eg. ThomasWal's previous comments: |
@michael | ThomasWal: why does the OO.o document relayout when fonts change |
@michael | that's a great question ! ;-) |
@michael | so - here is a real problem - that the font metrics - which specify how wide each glyph is, are different for the free fonts that are installed on your system to the 'standard' Microsoft ones, |
@michael | and that's really hard to fix; in fact almost impossible. |
@michael | the font metrics are in fact an integral part of your document - and if you're not using free fonts probably without realising it you're incorporating proprietary information into your document. |
@michael | this is -particularly- bad in 'impress' |
@michael | people that write presentations tend to tweak the strings until the line -just- fits on the screen, |
@michael | unfortunately - any change in metrics is more than likely to break that - pushing bullets off the bottom |
@michael | so - it's a real problem & virtually unfixable sadly. |
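The effect of a metrics change on a line that "just fits" can be sketched with a greedy word wrap. This is a simplification: each font is reduced to one hypothetical advance width per glyph (real fonts are proportional), and the widths, line width and sample text are invented for illustration.

```python
# Toy line breaker: shows how a small change in glyph widths reflows text.
# All numbers are hypothetical; real fonts have per-glyph metrics.

def text_width(text, advance):
    """Width of text when every glyph has the same advance width."""
    return len(text) * advance

def wrap(words, line_width, advance):
    """Greedy word wrap: start a new line when the next word will not fit."""
    lines, current = [], ""
    for word in words:
        candidate = word if not current else current + " " + word
        if not current or text_width(candidate, advance) <= line_width:
            current = candidate
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines

words = "This bullet only just fits".split()
LINE_WIDTH = 260

# Original font: 10 units per glyph, so 26 characters exactly fill the line.
print(wrap(words, LINE_WIDTH, advance=10))
# ['This bullet only just fits']

# Substitute font: 11 units per glyph, so the last word is pushed down.
print(wrap(words, LINE_WIDTH, advance=11))
# ['This bullet only just', 'fits']
```

A 10% difference in advance width is enough to turn one line into two, which is how bullets end up pushed off the bottom of a slide.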
@michael | . |
@michael | ThomasWal: why do you claim Win32's disk I/O is intelligent when sometimes it uses more memory for caching than it can afford -> SWAP |
@michael | ThomasWal: a fair point of course, all I can say is that for me, on the same machine Win32 cold-starts OO.o in almost the same time as a warm start, whereas on Linux it's nearly 2x as slow - another 8 secs of I/O. |
@michael | ThomasWal: since the hardware is the same - that seems to point to some more intelligent disk block re-ordering, predictive I/O based on previous startups etc. |
@michael | ThomasWal: of course - I haven't re-measured recently with the state of the art on both sides so YMMV etc. but - pragmatically: it's substantially faster [ and OO.o on Linux already pulls some pre-loading speedup tricks to try to help ] |
@michael | . |
@michael | so - no more questions ? (and pseudo-questions ;-) |
@alejandro | heh, no more, just interested in the new ECMA standard and the negotiation |
@alejandro | is OpenOffice going to support the new format? |
@michael | ah - the new ECMA standard - I have no idea; but I'm leaving for Brussels for the inaugural TC45 meeting tomorrow, |
@michael | wrt. OO.o support for 'Office Open XML' ;-) no idea, I would hope so though of course, |
@michael | <sarnold> michael: any chance someone will fork glibc so that -Bdirect has a chance of being shipped? |
@michael | sarnold: well - of course all distributors maintain glibc forks (effectively) and can do so, |
@michael | sarnold: but - sure, I'd like to encourage Ulrich to accept the -Bdirect patches in the end incorporating the linking speedup, |
@michael | sarnold: maintaining a fork is an expensive business in terms of labour & testing of course, |
@michael | sarnold: and the problem is that 'prelink' is seen as the great-white-hope here, |
@michael | sarnold: however - IMHO prelink is not a particularly pleasant solution, particularly wrt. re-writing existing libraries, picking random locations for libs for security reasons etc. |
@michael | sarnold: my current work (just today) is re-ordering the linker's generated symbol tables to be far more cache efficient for relocation processing, and also hash misses / chaining ;-) |
@michael | sarnold: perhaps with a number of such fixups we can get near the performance of prelink without needing to 'prelink' :-) |
@michael | sarnold: good question though. |
@alejandro | with the new release cycle, when is the next release planned? |
@michael | alejandro: ah - another good question, |
@michael | alejandro: so - the release details are in the wiki: |
@michael | eg. |
@michael | http://wiki.services.openoffice.org/wiki/OOoRelease202 |
@michael | http://wiki.services.openoffice.org/wiki/OOoRelease201 |
@michael | . |
@alejandro | nice, thanks. |
@michael | unfortunately 2.0.1 was/is delayed by various infrastructural failings of www.openoffice.org - such as the mailing lists not working ;-) [ cf. collab.net ] |
@michael | however - this of course will not delay 2.0.2 :-) |
@alejandro | then if there are no more questions, thanks michael for your presentation |
@michael | alejandro: thank you for inviting me :-) |
@alejandro | it was very interesting and I think there will be more people interested in hacking OOo now. :-) |
* michael must head back to the wife / bed. |
@michael | alejandro: glad you liked it; 'evening all. |
sarnold | michael: thanks :) |
Faelix | very good |
@gar | if he comes to brussels we could have a drink |
@alejandro | :-) |
* alejandro wants brussels beer. |
@alejandro | I think I'll come back next year for FOSDEM. |
@gar | just give a scream and I'll be there |
@alejandro | ;-) |
@gar | even without the scream i'll be there I guess |
@alejandro | good to know |
@alejandro | now I need to go rest, thanks for attending |
@alejandro | see you tomorrow |
@gar | yup, have a goodnight |