Rockbox Audio system design proposal

1) High level requirements

The Rockbox audio system should fulfil the following high-level requirements (in decreasing priority):

  1. Provide seamless audio playback
  2. Minimize energy consumption
  3. Minimize latency during flow breaking operations (track change, FW/BW)
  4. Have the largest possible amount of codec independent code in firmware
  5. Have the largest possible amount of generic (i.e. non hardware/player specific) code

2) High level design

The Rockbox audio system contains 3 subsystems:

  1. The Feeder, which submits uncompressed audio data to the output audio driver (or hardware).
  2. The Codecs, which take chunks of compressed audio data, decompress them and provide them in uncompressed form to the Feeder. Additionally, the Codecs are responsible for providing Rockbox with information on the tracks they are decoding (i.e. track length, sampling rate, elapsed/remaining time etc.)
  3. The Loader, which reads compressed data from the hard disk and provides it to the Codecs. The Loader tries to minimize the number of disk accesses.

The Feeder and Loader are part of Rockbox, while the Codecs are plug-ins.

The Codecs provide data to the Feeder via a small shared memory buffer (or two buffers to support cross-fading), referred to hereafter as the "dec-buffer".

The Codecs see the Loader as a caching file system. They request data from the Loader via a stream-like API (i.e. Open, Read, Seek) but receive pointers into the large buffers managed by the Loader, rather than having to provide their own buffers (as usual stream APIs require). The Loader managed buffers will be referred to hereafter as the "cache".

3) Normal flow of operations

The normal and most common flow of operations occurs while playing in the middle of a track. It is as follows:

Feeder:
  1. The Feeder awakes (usually from an interrupt) when the previous audio output operation is about to finish.
  2. The Feeder submits the next chunk of data to the output audio driver.
  3. It notices that the dec-buffer is no longer full.
  4. It wakes up the currently active Codec and asks it to provide enough decompressed data to fill the dec-buffer.
  5. The Feeder sleeps, awaiting the next event (step 1).

Codec:
  1. The active Codec is woken up by the Feeder and notices the request for uncompressed data.
  2. It checks whether its internal output buffer still holds some uncompressed data and, if so, transfers it to the dec-buffer.
  3. If the dec-buffer is not full yet, it requests a block of compressed data from the Loader.
  4. It receives a pointer into the cache and starts decompressing.
  5. The decompressed data is output directly to the dec-buffer if possible; otherwise it is decompressed into the Codec's internal output buffer and copied to the dec-buffer when the block has been fully processed.
  6. If the dec-buffer is not full yet, repeat from step 3.
  7. If the dec-buffer is full before the block has been fully processed, the extra data is kept in reserve in the internal buffer for the next time step 2 is executed.
  8. The Codec updates its internal structures, so that it knows which block it has to request the next time it is woken up.
  9. The Codec updates the elapsed/remaining time in the current track structure for Rockbox to display.
  10. The Codec sleeps, awaiting the next event (step 1).

Loader:
  1. The Loader is called by the Codec (Codec step 3 above).
  2. It checks its internal directory to find out whether the data requested by the Codec is completely in the cache or not.
  3. If the data is only partially in the cache, it decides whether to partially fulfil the request by returning only the part it has in the cache, or to fetch the missing data from disk and completely fulfil the request (non streaming Codec). This case should seldom occur.
  4. It launches the cache reevaluation thread (step 6).
  5. It returns a pointer into the cache to the Codec.
  6. The cache reevaluation thread is woken up (from step 4).
  7. It checks whether the cache consumption has reached a predefined threshold (for instance 85% has been used).
  8. If not, it does nothing and stops.
  9. If yes, it invalidates the consumed part of the cache and refills it by spinning up the hard disk and reading compressed data from files, according to the current playlist.
  10. It updates its internal directory to reflect the files that have been flushed from/read into the cache.
  11. It sleeps, awaiting the next event (step 6).

An alternative to the sequential calling sequences Feeder->Codec and Loader->Cache reevaluation would be to have the Codec and the cache reevaluation thread woken up periodically to check whether they have something to do.

4) Starting

The flow of operations when the user requests playing a track:

Rockbox:
  1. The user requests playing a track (from a playlist).
  2. Rockbox determines the Codec required for the current track and initialises the current track structure.
  3. Rockbox updates the current playlist.
  4. Rockbox launches the Feeder.

Feeder:
  1. The Feeder notices it has no data to output and calls the Codec to provide it with uncompressed data.
  2. The Feeder sleeps until the Codec has filled the dec-buffer (Codec step 5 below).

Codec:
  1. The Codec notices that the current track structure has just been initialized.
  2. It calls the Loader to obtain the track properties (for instance by seeking to the end of the file to access ID3 tags).
  3. It updates the current track structure accordingly.
  4. Like steps 3. to 9. of the normal flow in 3).
  5. The Codec launches the Feeder asynchronously.
  6. The Codec sleeps, awaiting the next event.

5) End of track

The flow of operations when a track is about to end (gapless version):

Feeder:
  1. The Feeder notices that the remaining time of the current track is below a given threshold.
  2. It calls Rockbox to prepare the playback of the next track and initialize the next track structure.
  3. It continues its normal operations as in 3).

Codec (current track instance):
  1. In step 5. of 3) the Codec notices that it does not have enough uncompressed audio data to fill the dec-buffer.
  2. It flags the current track structure as finished and indicates, for the Feeder's benefit, how many bytes it has actually written to the dec-buffer.
  3. It releases any resources it might have acquired, as this is the last time this instance will be active.

Feeder (later):
  4. The Feeder notices that the Codec could not completely fill the dec-buffer.
  5. It wakes up the Codec for the next track, passing it the next track structure and asking it to fill the rest of the dec-buffer.

Codec (next track instance):
  1-4. Like steps 1. to 4. of 4) with the "current" track structure replaced by the "next" track structure. The uncompressed data of the beginning of the next track is put in the dec-buffer directly after the data of the ending of the current track, ensuring gapless playback.
  5. The Codec sleeps, awaiting the next event.

If the Loader has done its job correctly, the Codec should quickly receive, in step 1, compressed data that has already been read ahead into the cache.

Feeder (later):
  6. The Feeder notices that the chunk of data it submits to the audio driver contains data from two tracks.
  7. It copies the data of the next track structure to the current track structure.

If the user does not want gapless playback, the Feeder has to handle it in Feeder step 6 by playing only the end of the current track, waiting for the gap duration and then playing the beginning of the next track.

For crossfading, two more buffers would be required: one extra dec-buffer for the uncompressed data provided by the next track Codec instance and one mix-buffer containing the weighted sum of both dec-buffers. The threshold in Feeder step 1 should be larger (several seconds) and the normal operation flow should be enhanced so that the Feeder submits data from the mix-buffer to the audio output driver and calls both the current and next track Codecs to keep both dec-buffers full.
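
The "weighted sum of both dec-buffers" mentioned for crossfading could look roughly like the sketch below. The proposal does not specify the weighting, so everything here is an assumption made for illustration: the function name crossfade_mix, 16-bit signed samples, a linear ramp, and 8-bit fixed-point weights.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the proposal): fill the mix-buffer with a
 * weighted sum of the current track dec-buffer (cur) and the next track
 * dec-buffer (next). The weight of `next` ramps linearly from 0 to 1 over
 * `fade_len` samples; `pos` is the position of the first sample of this
 * chunk within the fade.
 */
void crossfade_mix(const int16_t *cur, const int16_t *next, int16_t *mix,
                   size_t count, size_t pos, size_t fade_len)
{
    assert(fade_len > 0);

    for (size_t i = 0; i < count; i++) {
        size_t p = pos + i;
        if (p > fade_len)
            p = fade_len;                       /* fade finished: next only */

        /* Fixed-point weights in 1/256 steps; they always sum to 256, so
         * the mixed sample cannot exceed the 16-bit range. */
        int32_t w_next = (int32_t)((p * 256) / fade_len);
        int32_t w_cur = 256 - w_next;

        mix[i] = (int16_t)((cur[i] * w_cur + next[i] * w_next) / 256);
    }
}
```

Integer weights are used because the target players are assumed to lack floating point hardware; an equal-power curve could replace the linear ramp without changing the buffer layout.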
6) Flow break

The flow of operations when the user requests playing something other than the normal forward play according to the playlist (next track, previous track, beginning of track, FF/BW):

Rockbox:
  1. Rockbox notices that the user requests a flow break.
  2. If the flow break is within the current track, Rockbox updates the current track structure accordingly (time elapsed) and flags the change, so that the Codec is informed of the flow break. It is assumed that the Codec knows how to reach the updated time elapsed position.
  3. If the flow break implies a track change, Rockbox determines the Codec required for the new track, initialises the current track structure and updates the playlist.
  4. Rockbox empties the dec-buffer.
  5. Rockbox continues with step 4. of 4).

With luck (especially for next track, beginning of track or FF/BW), the compressed data will already be in the cache and the latency will be small.

7) Feeder

Most of the Feeder has been described above. The Feeder must run at a high priority to ensure seamless playback. It is assumed that it will execute in short bursts, launching hardware driven background operations of the audio driver (DMA transfer, asynchronous interrupts).

The Feeder should also provide hooks for postprocessing of uncompressed data (filters like resampling, gain etc.)

8) Codecs

Codecs must be able to provide the properties (like ID3 tags) of a track via an API.

Codecs must be able to decode the audio in several invocations. This means that they should be able to decode a certain amount of data, transfer it to the dec-buffer and store enough status information to quickly resume decoding at a later point. The internal output buffer mentioned in 3) handles the mismatch between the block sizes the Codecs use and the chunk size used by the Feeder.
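
As an illustration of how the internal output buffer could absorb this size mismatch (Codec steps 2, 5 and 7 in 3)), here is a minimal sketch. The structure carry_buf, the functions carry_drain and carry_deliver, and the CODEC_BLOCK_MAX size are all hypothetical names and values chosen for the example, not part of the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CODEC_BLOCK_MAX 4608  /* assumed upper bound on one decoded block */

/* Hypothetical internal output buffer of a Codec. */
struct carry_buf {
    unsigned char data[CODEC_BLOCK_MAX];
    size_t len;               /* bytes currently held in reserve */
};

/* Step 2: move as much of the reserve as fits into the dec-buffer.
 * Returns the number of bytes written to dst. */
size_t carry_drain(struct carry_buf *cb, unsigned char *dst, size_t room)
{
    size_t n = cb->len < room ? cb->len : room;

    memcpy(dst, cb->data, n);
    memmove(cb->data, cb->data + n, cb->len - n);  /* keep the remainder */
    cb->len -= n;
    return n;
}

/* Steps 5 and 7: hand a freshly decoded block to the dec-buffer and keep
 * whatever does not fit in reserve. Returns the bytes written to dst. */
size_t carry_deliver(struct carry_buf *cb, const unsigned char *block,
                     size_t block_len, unsigned char *dst, size_t room)
{
    size_t n = block_len < room ? block_len : room;

    assert(cb->len == 0);                        /* reserve drained first */
    assert(block_len - n <= sizeof cb->data);    /* excess must fit */

    memcpy(dst, block, n);
    memcpy(cb->data, block + n, block_len - n);
    cb->len = block_len - n;
    return n;
}
```

On each wake-up the Codec would call carry_drain first and only decode a new block if the dec-buffer still has room, which matches the ordering of steps 2 and 3.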
The Codecs should also provide hooks for Codec specific processing on compressed data or on data being decompressed (for instance a bandwidth equalizer can be applied more efficiently to the audio data while it is still in the frequency domain).

9) Loader

The Loader provides a stream-like API to the Codecs. When calling the Loader, the Codecs can pass hints that help the Loader apply its optimizing strategy. These hints could be:

  - Read audio data, streaming mode
    -> The Loader knows that audio data before the requested block is a candidate for discarding.
  - Read audio data, non streaming mode
    -> The Loader knows that it has to keep the whole file in the cache until it has finished playing.
  - Read non audio data (e.g. tags)
    -> In the case of a file too large to be completely loaded into the cache, the Loader will have to fetch the non audio data from the disk (usually at the end of the file) but will not try to fetch more data from the next track (as it would do for a normal audio data read).

The Loader will have to handle different cases, ranging from having several small files in the cache to having only part of a large file in it. It could also have several parts of the same file in the cache (audio and non audio data of large files). Its strategy is to minimize the hard disk spin ups (i.e. grouping accesses to disk), assuming a forward play flow in the playlist.

10) Issues

The following issues are to be solved:

a) Resource arbitration between the subsystems, mainly:
   - Several or even all the subsystems might be executing in parallel. The CPU/memory allocation must ensure that the audio playback has the highest priority and the Loader disk read the lowest one.
   - The producer/consumer problem between the Feeder and Codecs (dec-buffer) and between the Codecs and Loader (cache).
   - Common structures, like the current track structure, are written/read by several actors.
b) Optimal sizing of the dec-buffer(s), the cache and the Codecs' working set.
c) Seeking backwards or forwards might be difficult for some Codecs: Rockbox will tell them the time position they must reach, and the Codecs have to find the appropriate offset in the compressed data stream by themselves. This could mean making a first guess, then scanning the data, possibly having to decode part of it and dropping the samples until the suitable position has been reached.
d) Having the Codecs provide the track elapsed time will not be very accurate, as they are decoding ahead of the actual playback and the resolution will be that of a block. Rockbox might display a compensated value, or this task might be delegated to the Feeder.
e) The handling of filters is not detailed.
f) The case of files with the same extension (containers like .ogg) that might require different Codecs is not detailed. It could be solved by having a generic Codec that is started like a normal Codec but, instead of decompressing the audio, identifies the appropriate Codec and replaces itself with that Codec. Alternatively, Rockbox could require that such files have different extensions, like .oggv for Vorbis, .oggs for Speex and so on.
h) Only hard disk access is considered as an energy hungry activity. The design should also consider other activities, like CPU/memory usage and the ability to speed the CPU up or down.
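
For issue d), one possible form of the "compensated value" is to subtract the playing time of the audio that has been decoded but not yet output. The sketch below assumes that the amount of pending data in the dec-buffer is known in bytes and that the audio is plain PCM; the function name played_elapsed_ms and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical compensation: decoded_ms is the elapsed time reported by the
 * Codec (position of the last decoded sample), buffered_bytes is the amount
 * of uncompressed data still waiting in the dec-buffer.
 * Returns an estimate of the position actually being played, in ms.
 */
uint32_t played_elapsed_ms(uint32_t decoded_ms, uint32_t buffered_bytes,
                           uint32_t sample_rate, uint32_t channels,
                           uint32_t bytes_per_sample)
{
    assert(sample_rate > 0 && channels > 0 && bytes_per_sample > 0);

    /* How far the Codec is ahead of the output, in milliseconds. */
    uint64_t bytes_per_sec = (uint64_t)sample_rate * channels * bytes_per_sample;
    uint32_t lead_ms = (uint32_t)(((uint64_t)buffered_bytes * 1000) / bytes_per_sec);

    /* Never report a negative position at the very start of a track. */
    return decoded_ms > lead_ms ? decoded_ms - lead_ms : 0;
}
```

With 44.1 kHz, 16-bit stereo audio, 176400 pending bytes correspond to one second, so a Codec reporting 10000 ms would be displayed as 9000 ms. Data already handed to the audio driver is not accounted for in this sketch.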