path: root/libavformat/matroskadec.c
Commit message (Author, Age)
* avformat/matroskadec: Fix default value of BlockAddID (Andreas Rheinhardt, 2019-11-20)
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
  Signed-off-by: James Almer <jamrial@gmail.com>
* avformat/matroskadec: don't rescale mastering display values (James Almer, 2019-10-05)
  Simplifies code.
  Signed-off-by: James Almer <jamrial@gmail.com>
* avformat/matroskadec: Fix demuxing ProRes (Andreas Rheinhardt, 2019-10-04)
  The structure of a ProRes frame in mov/mp4 is that of a typical atom: first a 32-bit big-endian size field, then a tag detailing the content. Said size field includes the eight bytes of the atom header. This header is actually redundant, as the size of the atom is already known from the containing atom. It is therefore stripped away when muxed into Matroska, so the Matroska demuxer has to recreate it upon demuxing. But it did not account for the fact that the size field includes the size of the header, and this can lead to problems when a decoder uses the in-band size field.
  Fixes ticket #8210.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
  Signed-off-by: James Almer <jamrial@gmail.com>
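  The header reconstruction described in this commit can be sketched roughly as follows. This is not FFmpeg's code: `prores_restore_atom` is a hypothetical helper, and the `icpf` tag is an assumption about the ProRes frame tag rather than something the commit message states. The point it illustrates is only that the size field must count the eight header bytes as well.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not FFmpeg's code): rebuild the 8-byte atom header
 * in front of a header-less ProRes frame taken from a Matroska block. */
static uint8_t *prores_restore_atom(const uint8_t *payload, uint32_t payload_size,
                                    uint32_t *atom_size)
{
    uint8_t *atom;
    uint32_t size;

    assert(payload_size <= UINT32_MAX - 8); /* the size field is only 32 bits */
    size = payload_size + 8;                /* the size field counts the header too */

    atom = malloc(size);
    if (!atom)
        return NULL;

    /* 32-bit big-endian size field */
    atom[0] = (uint8_t)(size >> 24);
    atom[1] = (uint8_t)(size >> 16);
    atom[2] = (uint8_t)(size >> 8);
    atom[3] = (uint8_t)size;
    memcpy(atom + 4, "icpf", 4);            /* assumed ProRes frame tag */
    memcpy(atom + 8, payload, payload_size);

    *atom_size = size;
    return atom;
}
```

  The caller owns the returned buffer and releases it with `free`. Writing `payload_size` instead of `payload_size + 8` into the size field would reproduce the kind of bug this commit fixes.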
* repeat an even number of characters in occured (Michael Niedermayer, 2019-09-16)
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avformat/matroskadec: use av_fast_realloc to reallocate ebml list arrays (James Almer, 2019-09-04)
  Speeds up the process considerably.
  Fixes ticket #8109.
  Suggested-by: nevcairiel
  Suggested-by: cehoyos
  Signed-off-by: James Almer <jamrial@gmail.com>
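  Why over-allocating helps can be shown with a small stand-alone sketch. It does not use `av_fast_realloc` itself; `GrowList` and `grow_list_append` are hypothetical names, and the doubling policy is an assumption chosen for illustration, not a statement about how libavutil sizes its buffers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical list type, loosely modelled on a growable element array. */
typedef struct GrowList {
    int   *elem;
    size_t nb_elem;   /* elements in use */
    size_t capacity;  /* elements allocated */
} GrowList;

/* Append one value; reallocate only when the spare capacity is used up,
 * so that n appends need O(log n) reallocations instead of n. */
static int grow_list_append(GrowList *list, int value)
{
    assert(list != NULL);

    if (list->nb_elem == list->capacity) {
        size_t new_cap = list->capacity ? list->capacity * 2 : 8;
        int *tmp = realloc(list->elem, new_cap * sizeof(*tmp));
        if (!tmp)
            return -1;            /* the old array is still valid */
        list->elem     = tmp;
        list->capacity = new_cap;
    }
    list->elem[list->nb_elem++] = value;
    return 0;
}
```

  Growing the array by exactly one element per append, by contrast, may copy the whole array every time, which becomes quadratic for lists with many entries.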
* avformat/matroskadec: use proper types for some EbmlSyntax fields (James Almer, 2019-09-04)
  Signed-off-by: James Almer <jamrial@gmail.com>
* avformat/matroskadec: Fix seeking (Andreas Rheinhardt, 2019-08-16)
  matroska_reset_status (a function that is used during seeking, among other things) used an int for the return value of avio_seek, which returns an int64_t. Checking the return value then indicated an error even though the seek was successful for targets in the ranges 2GB-4GB, 6GB-8GB, ... This error implied that the status had not been reset; in particular, the old level was still considered to be in force, so that ebml_parse returned errors because the newly parsed elements were of course no longer contained in the previously active (and still wrongly considered active) master element.
  Addresses ticket #8084.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
  Signed-off-by: James Almer <jamrial@gmail.com>
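  The affected ranges follow directly from keeping only the low 32 bits of the offset. The sketch below emulates that truncation in a well-defined way, assuming a 32-bit two's-complement int; all function names are hypothetical and nothing here calls avio_seek.

```c
#include <assert.h>
#include <stdint.h>

/* Emulate storing a 64-bit offset in a 32-bit two's-complement int. */
static int32_t truncate_to_int32(int64_t v)
{
    uint32_t low = (uint32_t)v; /* keep the low 32 bits (modular conversion) */

    return low >= 0x80000000u ? (int32_t)((int64_t)low - 4294967296LL)
                              : (int32_t)low;
}

/* Buggy check: the 64-bit seek result was squeezed into an int first. */
static int seek_looks_failed_buggy(int64_t seek_result)
{
    return truncate_to_int32(seek_result) < 0;
}

/* Fixed check: compare the full 64-bit value. */
static int seek_looks_failed_fixed(int64_t seek_result)
{
    return seek_result < 0;
}
```

  An offset of 3 GiB (3221225472) has bit 31 set and therefore looks negative to the buggy check, while 1 GiB and 5 GiB do not; this matches the 2GB-4GB, 6GB-8GB, ... pattern named in the commit message.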
* matroskadec: Remove redundant const (Andreas Rheinhardt, 2019-07-18)
  The typedef used to define EbmlSyntax already includes a const qualifier so that it is unnecessary to include another const qualifier in future definitions and declarations. Given that MSVC warns about this, this commit removes these redundant const qualifiers.
  Suggested-by: Hendrik Leppkes <h.leppkes@gmail.com>
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* matroskadec: Add sizes to forward declarations (Andreas Rheinhardt, 2019-07-18)
  Unknown-length elements end when an element not allowed in them, but allowed at a higher level is encountered. In order to check for this, c1abd95a added a pointer to every syntax level's parent to each EbmlSyntax. Given that the parent must of course also reference the child in order to be able to enter said child level, one needs to use forward declarations.
  These forward declarations constitute tentative definitions and tentative definitions with internal linkage (like our syntaxes) must not be an incomplete type. Yet they were an incomplete type and while GCC and Clang did not even warn about this (on default warning levels), it broke compilation with MSVC. Therefore this commit adds the sizes.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
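  A reduced illustration of the problem, with hypothetical names (`DemoSyntax`, `demo_parent`, `demo_child`) instead of the real syntax tables: two static arrays that point at each other need a forward declaration, and giving that declaration an explicit size makes it a complete type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced syntax entry; the typedef carries the const. */
typedef const struct DemoSyntax {
    uint32_t id;
    const struct DemoSyntax *child;   /* syntax used to enter this element */
    const struct DemoSyntax *parent;  /* syntax of the containing level */
} DemoSyntax;

/* Forward declaration with internal linkage. Writing "demo_child[]" here
 * would be a tentative definition of incomplete type; the explicit size
 * makes the type complete. */
static DemoSyntax demo_child[2];

static DemoSyntax demo_parent[] = {
    { 0x10, demo_child, NULL },
    { 0 }
};

/* The actual definition, referring back to the parent. */
static DemoSyntax demo_child[2] = {
    { 0x20, NULL, demo_parent },
    { 0 }
};
```

  The price of this approach is that the size in the forward declaration has to be kept in sync with the number of entries in the later definition.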
* avformat/matroskadec: Improve check for level 1 duplicates (Andreas Rheinhardt, 2019-07-16)
  If a file uses unknown-length level 1 elements besides clusters and such elements are after the first cluster, then these elements will usually be parsed twice: Once during parsing of the file header and once when reading the file reaches the position where these elements are located. The second time the element is parsed leads to a "Duplicate element" error message. Known-length elements are not affected by this as they are skipped except during parsing the header.
  This commit fixes this by explicitly adding a check for whether the position of the element to be parsed is the same as the position of the already known level 1 element.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Use file offsets for level 1 elements (Andreas Rheinhardt, 2019-07-16)
  This commit converts the MatroskaLevel1Element struct to use file-based offsets, as opposed to the current practice of using offsets relative to the beginning of the segment in it. This also includes a change from uint64_t to int64_t.
  This is in preparation to another patch that improves the check for duplicate level 1 elements.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Reindent after previous commit (Andreas Rheinhardt, 2019-07-16)
  Also use the smallest scope possible for a loop variable.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Accept more unknown-length elements II (Andreas Rheinhardt, 2019-07-16)
  Up until now, one last kind of unknown-length element hasn't been properly handled: Unknown-length elements that are supposed to be skipped, i.e. the level 1 elements that might reside after the clusters.
  This commit changes this. To do this, ebml_parse got a mode that essentially tries to skip everything except when parsing is needed (namely for unknown-length elements for which parsing is necessary as they can't be skipped). This mode is selected by using a NULL as destination where the parsed data should be written to. It is used to parse the level 1 elements in matroska_parse_cluster.
  The syntax list used for parsing must of course include links to the syntax of all the master elements that might need to be parsed. In other words: Instead of matroska_clusters (which contained every level 1 element except clusters as EBML_NONE elements designated to be skipped) matroska_segment is needed and used; matroska_clusters has been removed. Furthermore, matroska_segment has been reordered so that clusters are at the front as this is now the most common case for this list.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Fix probing of unknown-length headers (Andreas Rheinhardt, 2019-07-16)
  matroska_probe did not support the case of an unknown-length EBML header at all; given that libavformat's Matroska muxer used to produce such files in the streaming case, support for them has been added.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Accept more unknown-length elements (Andreas Rheinhardt, 2019-07-16)
  The current Matroska specifications mandate that only two elements may be of unknown length: Segments and clusters. But this was not always so: For the greater part of Matroska's existence, all master elements were allowed to make use of the unknown-length feature.
  And there were muxers creating such files: For several years libavformat's Matroska muxer used unknown-length for all master elements when the output wasn't seekable. This only stopped in March 2010 with 2529bb30. And even afterwards it was possible (albeit unlikely) for libavformat to create unknown-length master elements that are in violation of today's specifications, namely if the master element was so big that the seek backwards to update the size could no longer be performed inside the AVIOContext's write buffer. This has only been fixed in October 2016 (with the patches that introduced support for writing CRC-32 elements).
  Libavformat's Matroska demuxer meanwhile has never really supported unknown-length elements besides segments and clusters. Support for the latter was hardcoded. This commit changes this: Now all master elements for which a syntax to parse them is available are supported. This includes the files produced by old versions of libavformat's muxer.
  More precisely, master elements that have unknown length and are about to be parsed (not skipped) are supported; only a warning is emitted for them. For normal files, this means that level 1 elements after the clusters that are encountered after the clusters have been parsed (i.e. not because they are referenced by the seekhead at the beginning of the file) are still unsupported (they would be skipped at this point if their length were known).
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Improve invalid length error handling (Andreas Rheinhardt, 2019-07-16)
  1. Up until now, the error message for EBML numbers whose length exceeds the limits imposed upon them because of the element's type did not distinguish between known-length and unknown-length elements. As a consequence, the numerical value of the define constant EBML_UNKNOWN_LENGTH was emitted as part of the error message which is of course not appropriate. This commit changes this by adding error messages designed for unknown-length elements.
  2. We impose some (arbitrary) sanity checks on the lengths of certain element types; these checks were conducted before the checks depending on whether the element exceeds its containing master element. Now the order has been reversed, because a failure at the (formerly) latter check implies that the file is truly erroneous and not only fails our arbitrary length limit. Moreover, this increases the informativeness of the error messages.
  3. Furthermore, the error message in general has been changed by replacing the type of the element (something internal to this demuxer and therefore suitable as debug output at best, not as an error message intended for ordinary users) with the element ID. The element's position has been added, too.
  4. Finally, the length limit for EBML_NONE elements has been changed so that all unknown-length elements of EBML_NONE-type trigger an error. This is done because unknown-length elements can't be skipped and need to be parsed, but there is no syntax to parse available for EBML_NONE elements. This is done in preparation for a further patch which allows more unknown-length elements than just clusters and segments.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Don't skip too much when unseekable (Andreas Rheinhardt, 2019-07-16)
  The Matroska (and WebM) file format achieves forward-compatibility by insisting that demuxers ignore and skip elements they don't know about. Unfortunately, this complicates the detection of errors, as errors resulting from losing sync can't be reliably distinguished from unknown elements that are part of a future version of the standard.
  Up until now, the strategy to deal with this situation was to skip all unknown elements that are not obviously erroneous; if an error happened, the demuxer tried to seek to the last known good position to resync from (and resync to level 1 elements). This works fine if the input is seekable, but if it is not, then the skipped data can usually not be rechecked later on. This is particularly acute if unknown-length clusters are in use, as the check for whether a child element exceeds the containing master element is ineffective in this situation.
  To remedy this, a new heuristic has been introduced: If an unknown element is encountered in non-seekable mode, an error is presumed to have happened based upon a combination of the length of the row of the already encountered unknown elements and of how far away skipping this element would take us.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Typos, nits and cosmetics (Andreas Rheinhardt, 2019-07-16)
  Cosmetics include reordering EbmlType so that EBML_SINT is adjacent to the other numbers (and matches the order in the switch in ebml_parse) and also reordering the switch for assignment of default values so that it matches the order in EbmlType.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Reuse positions (Andreas Rheinhardt, 2019-07-16)
  Up until now, avio_tell was used multiple times in ebml_parse and its subroutines, although the result of these calls can usually be simply derived from the result of earlier calls to avio_tell. This has been changed. Unnecessary calls to avio_tell in ebml_parse are avoided now.
  Furthermore, there has been a slight change in the output of some error messages relating to elements exceeding their containing master element: The reported position of the element now points to the first byte of the element ID and no longer to the first byte of the element's payload.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Redo EOF handling (Andreas Rheinhardt, 2019-07-16)
  This commit closes the last hole in the system of checks for a known-length file ending too early: Now an error message is emitted in case the file ends directly after an EBML element.
  Furthermore, this commit adds a check and a corresponding warning whether there is data beyond the Matroska segment (only reasonable for known-length segments). If everything looks alright, then parsing is stopped as soon as EOF is reached (in contrast, the earlier code would always call matroska_resync at the end).
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Combine arrays (Andreas Rheinhardt, 2019-07-16)
  By including SimpleBlocks and BlockGroups twice in the same EbmlSyntax array (with different semantics), one can reduce the duplication of the other values.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Don't reset cluster position (Andreas Rheinhardt, 2019-07-16)
  The new code does not rely on whether the cluster's position is set or not to infer whether a cluster needs to be closed or not (instead, this is done in ebml_parse), so there is no need to reset the cluster's position at all any more. It will be automatically set to the correct value when a cluster is entered.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Make cluster parsing level compatible (Andreas Rheinhardt, 2019-07-16)
  Before this commit, the parsing of clusters mixed EBML levels by allowing elements from different levels in an EbmlSyntax (namely matroska_cluster_parsing). This has been changed, and the level is now explicitly used to determine how to parse.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Redo level handling (Andreas Rheinhardt, 2019-07-16)
  This commit changes how levels are handled: If the level used for ebml_parse ends directly after an element that has been consumed, then ebml_parse ends the level itself (and any known-length levels that end there as well) and informs the caller via the return value; if the current level is of unknown-length, then the level is ended as soon as an element that is not valid on the current level, but on a higher level is encountered (or if EOF has been encountered). This is designed for situations where one wants to parse master elements incrementally, i.e. not in one go via ebml_parse_nest.
  The (incremental) parsing of clusters still mixes levels by using a syntax list that contains elements from different levels and the level is still ended manually via a call to ebml_level_end if the last cluster was an unknown-length cluster (known-length clusters are already ended when their last element is read), but only if the next element is a cluster, too. A different level 1 element following an unknown-length cluster will currently simply be presumed to be part of the earlier cluster. Fixing this will be done in a future patch. The modifications to matroska_parse_cluster contained in this patch are only intended not to cause regressions.
  Nevertheless, the fact that known-length levels are automatically ended in ebml_parse when their last element has been read already fixes a bogus error message introduced in 9326117b that was emitted when a known-length cluster is followed by another level 1 element other than a cluster in which case the cluster's level was not ended (which only happened when a new cluster has been encountered) so that the length check (introduced in 9326117b) failed for the level 1 element as it is of course not contained in the previous cluster. Most Matroska files were affected by this.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Link to parents in syntax tables (Andreas Rheinhardt, 2019-07-16)
  By linking to the syntax of the parent (i.e. the containing master element) one can check whether an element is actually part of a higher level in the EBML hierarchy. Knowing this is important for unknown-length levels, because they end when an element that doesn't belong to this, but to a higher hierarchy level is encountered.
  Sometimes there are different syntaxes dealing with the same elements. In this case it is important to use a parent that contains all the elements at the parent level; whether this is the syntax actually used to enter the child's level is irrelevant. This affects the list of level 1 elements (which has been used as parent for matroska_cluster, too) and it affects recursive elements (currently only the SimpleTag), where the non-recursive parent has to be chosen.
  This is in preparation for a patch that redoes level handling.
  Finally, the segment id has been added to ebml_syntax. This will enable handling of unknown-length EBML headers.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
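  The check that parent links make possible can be sketched as follows. `DemoLevel` and both functions are hypothetical and far simpler than the real EbmlSyntax tables; the element IDs are the usual Matroska ones for Cluster, Cues, Timecode, SimpleBlock and BlockGroup, used here purely as sample data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced model of one syntax level with a parent link. */
typedef struct DemoLevel {
    const uint32_t *ids;             /* element IDs valid on this level */
    size_t nb_ids;
    const struct DemoLevel *parent;  /* containing level, NULL at the top */
} DemoLevel;

static const uint32_t demo_segment_ids[] = { 0x1F43B675, 0x1C53BB6B }; /* Cluster, Cues */
static const uint32_t demo_cluster_ids[] = { 0xE7, 0xA3, 0xA0 };       /* Timecode, SimpleBlock, BlockGroup */

static const DemoLevel demo_segment = { demo_segment_ids, 2, NULL };
static const DemoLevel demo_cluster = { demo_cluster_ids, 3, &demo_segment };

static int level_contains(const DemoLevel *level, uint32_t id)
{
    for (size_t i = 0; i < level->nb_ids; i++)
        if (level->ids[i] == id)
            return 1;
    return 0;
}

/* Return 1 if `id` is not valid on `cur` but is valid on some ancestor,
 * i.e. if encountering it ends an unknown-length level. */
static int ends_unknown_length_level(const DemoLevel *cur, uint32_t id)
{
    assert(cur != NULL);

    if (level_contains(cur, id))
        return 0;
    for (const DemoLevel *p = cur->parent; p; p = p->parent)
        if (level_contains(p, id))
            return 1;
    return 0; /* unknown everywhere: could be a new element or junk */
}
```

  An ID that is unknown on every level returns 0 here on purpose: such an element cannot be told apart from junk by this check alone, which is the problem other commits in this series deal with.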
* avformat/matroskadec: Introduce a "last known good" position (Andreas Rheinhardt, 2019-07-16)
  Currently, resyncing during reading packets works as follows: The current position is recorded, then a call to matroska_parse_cluster is made and if said call fails, the demuxer tries to resync from the earlier position. If the call doesn't fail, but also doesn't deliver a packet, then this is looped.
  There are two problems with this approach:
  1. The Matroska file format aims to be forward-compatible; to achieve this, a demuxer should simply ignore and skip elements it doesn't know about. But it is not possible to reliably distinguish unknown elements from junk. If matroska_parse_cluster encounters an unknown element, it can therefore not simply error out; instead it returns zero and the loop is iterated which includes an update of the position that is intended to be used in case of errors, i.e. the element that is skipped is not searched for level 1 element ids to resync to at all if later calls to matroska_parse_cluster return an error. Notice that in case that sync has been lost there can be a chain of several unknown/possibly junk elements before an error is detected.
  2. Even if a call to matroska_parse_cluster delivers a packet, this does not mean that everything is fine. E.g. it might be that some of the block's data is missing and that the data that was presumed to be from the block just read actually contains the beginning of the next element. This will only be apparent at the next call of matroska_read_packet, which uses the (false) end of the earlier block as resync position so that in the (not unlikely) case that the call to matroska_parse_cluster fails, the data believed to be part of the earlier block is not searched for a level 1 element to resync to.
  To counter this, a "last known good" position is introduced. When an element id that is known to be allowed at this position in the hierarchy (according to the syntax currently in use for parsing) is read and some further checks (regarding the length of the element and its containing master element) are passed, then the beginning of the current element is treated as a "good" position and recorded as such in the MatroskaDemuxContext. Because of 2., only the start of the element is treated as a "good" position, not the whole element. If an error occurs later during parsing of clusters, the resync process starts at the last known good position.
  Given that when the header is damaged the subsequent resync never skips over data and is therefore unaffected by both issues, the "last known good" concept is not used there.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Refactor some functions (Andreas Rheinhardt, 2019-07-16)
  Since the changes to the parsing of SimpleBlocks, both ebml_parse_id and ebml_parse_elem are only called from one place, so that it is possible to inline these two function calls. This is done, but not completely: ebml_parse_id still exists in a modified form. This is done in preparation for a further patch regarding the handling of unknown-length elements.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Use proper levels after discontinuity (Andreas Rheinhardt, 2019-07-16)
  The earlier code set the level to zero upon seeking and after a discontinuity although in both cases parsing (re)starts at a level 1 element.
  Also set the segment's length to unknown if an error occurred in order not to drop any valid data that happens to be beyond the designated end of the segment.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Add function to reset status (Andreas Rheinhardt, 2019-07-16)
  This function will be useful later to reset the status (e.g. current level and the already parsed id).
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Don't abort resyncing upon seek failure (Andreas Rheinhardt, 2019-07-16)
  When an error happens, the Matroska demuxer tries to resync to level 1 elements from an earlier position onwards. If the seek to said earlier position fails, the demuxer currently treats this as an unrecoverable error. And that behaviour is suboptimal as said failure is nothing unrecoverable or unexpected (when the input isn't seekable). It is preferable to simply resync from the earliest position available (i.e. the start of the AVIOContext's buffer) onwards if the seek failed.
  Here are some scenarios that might be treated as unrecoverable errors by the current code if the input isn't seekable. They all have in common that the current position is so far away from the desired position that the seek can't be fulfilled from the AVIOContext's buffer:
  1. Blocks (both SimpleBlocks as well as a Block in a BlockGroup) for which reading them as binary EBML elements succeeds, but whose parsing triggers an error (e.g. an invalid TrackNumber). In this case the earlier position from which resyncing begins is at the start of the block (or even earlier).
  2. BlockGroups, whose parsing fails in one of the latter elements. Just as in 1., the start of the BlockGroup (the target of the seek) might be so far away from the current position that it is no longer in the buffer.
  3. At the beginning of parsing a cluster, the cluster is parsed until a SimpleBlock or a BlockGroup is encountered. So if the input is damaged between the beginning of the cluster and the first occurrence of a SimpleBlock/BlockGroup and if said damage makes the demuxer read/skip so much data that the beginning of the cluster is no longer in the buffer, demuxing will currently fail completely.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* matroskadec: Fix overflow introduced in a569a7b3 (Andreas Rheinhardt, 2019-07-06)
  This commit fixes an overflow introduced in a569a7b3 that affected EBML elements that the Matroska demuxer doesn't want to parse like CRC-32 elements. The return value of avio_skip (the new position on success or an AVERROR on failure) has been assigned to an integer which meant that new positions in the range of 2GB to 4GB-1 etc. were considered errors.
  Fixes ticket #8001.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
  Signed-off-by: James Almer <jamrial@gmail.com>
* avformat/matroskadec: Improve error/EOF checks III (Andreas Rheinhardt, 2019-06-24)
  Up until now, when an element was skipped, ffio_limit was relied upon to make sure that there is enough data available to skip. ffio_limit itself relies upon the availability of the file's size. As this needn't be available, the check has been refined: First one byte less than intended is skipped, then another byte is read, followed by a check of the error flags.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
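  The refined check can be modelled on a tiny in-memory reader. `MemReader` and its functions are hypothetical stand-ins, not the AVIOContext API; the assumption being illustrated is that a skip cannot tell whether the data was really there, whereas reading a byte can.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory reader standing in for an I/O context. */
typedef struct MemReader {
    const uint8_t *data;
    size_t size;
    size_t pos;          /* invariant: pos <= size */
    int eof_reached;     /* set when a read finds no data */
} MemReader;

/* Skipping silently stops at the end and reports nothing. */
static void reader_skip(MemReader *r, size_t len)
{
    r->pos = len > r->size - r->pos ? r->size : r->pos + len;
}

/* Reading is what actually notices that the data has run out. */
static int reader_get_byte(MemReader *r)
{
    if (r->pos >= r->size) {
        r->eof_reached = 1;
        return 0;
    }
    return r->data[r->pos++];
}

/* Skip len bytes and verify that the last of them really existed:
 * skip len - 1 bytes, read one byte, then inspect the error flag. */
static int checked_skip(MemReader *r, size_t len)
{
    assert(r != NULL);

    if (len == 0)
        return 0;
    reader_skip(r, len - 1);
    reader_get_byte(r);
    return r->eof_reached ? -1 : 0;
}
```

  With ten bytes of data, `checked_skip(&r, 10)` succeeds and `checked_skip(&r, 11)` fails, without the reader ever needing to know the total size in advance.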
* avformat/matroskadec: Improve read error/EOF checks II (Andreas Rheinhardt, 2019-06-24)
  This commit fixes a number of bugs:
  1. There was no check that no read error/EOF occurred during ebml_read_uint, ebml_read_sint and ebml_read_float.
  2. ebml_read_ascii and ebml_read_binary did sometimes not forward error codes; instead they simply returned AVERROR(EIO).
  3. In particular, AVERROR_EOF hasn't been used and no dedicated error message for it existed. This has been changed.
  In order to reduce code duplication, the new error code NEEDS_CHECKING has been introduced which makes ebml_parse check the AVIOContext's status for errors.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Improve read error/EOF checks I (Andreas Rheinhardt, 2019-06-24)
  ebml_read_num had a number of flaws:
  1. The check for read errors/EOF was totally wrong. E.g. an EBML number beginning with the invalid 0x00 would be considered a read error, although it is just invalid data.
  2. The check for read errors/EOF was done just once, after reading the first byte of the EBML number. But errors/EOF can happen in between, of course, and this wasn't checked.
  3. There was no way to distinguish when EOF should be an error (because the data has to be there) for which an error message should be emitted and when it is not necessarily an error (namely during parsing of EBML IDs). Such a possibility has been added and used.
  All this was fixed; furthermore, the error messages for invalid EBML numbers were improved and useless initializations were removed.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
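  The distinction between invalid data and missing data can be sketched with a buffer-based reader for EBML variable-length numbers. This is not ebml_read_num: `read_ebml_num` and the `NUM_*` codes are hypothetical, and the sketch always strips the length marker, which is a simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { NUM_OK = 0, NUM_INVALID = -1, NUM_EOF = -2 };

/* Read one EBML variable-length number from buf. The position of the first
 * set bit in the first byte gives the total length in bytes (1 to 8). */
static int read_ebml_num(const uint8_t *buf, size_t size, int max_len,
                         uint64_t *num, int *len)
{
    uint8_t first;
    uint64_t value;
    int n = 1;

    assert(max_len >= 1 && max_len <= 8);

    if (size < 1)
        return NUM_EOF;               /* nothing to read at all */

    first = buf[0];
    if (first == 0)
        return NUM_INVALID;           /* longer than 8 bytes: bad data, not EOF */

    while (!(first & (0x80 >> (n - 1))))
        n++;
    if (n > max_len)
        return NUM_INVALID;
    if ((size_t)n > size)
        return NUM_EOF;               /* the data ends in the middle of the number */

    value = first & (0xFF >> n);      /* strip the length marker bit */
    for (int i = 1; i < n; i++)
        value = (value << 8) | buf[i];

    *num = value;
    *len = n;
    return NUM_OK;
}
```

  A caller parsing an element ID can treat `NUM_EOF` on an empty buffer as a clean end of file, while a caller parsing a length that has to follow an ID can report the same code as an error.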
* avformat/matroskadec: Properly check return values (Andreas Rheinhardt, 2019-06-24)
  Up until now, webm_dash_manifest_cues used the return values of ebml_read_num and ebml_read_length without checking for errors, i.e. return values < 0. This has been changed.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Don't zero unnecessarily (Andreas Rheinhardt, 2019-06-24)
  It is only necessary to zero the initial allocated memory used to store the size of laced frames if the block used Xiph lacing. Otherwise no uninitialized data was ever used, so use av_malloc instead of av_mallocz.
  Also use the correct type for the allocations.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Treat SimpleBlock as EBML_BIN (Andreas Rheinhardt, 2019-06-23)
  Up until now, the SimpleBlock was treated specially: It basically had its own EBML category and it was also included in the BlockGroup EBML syntax (although a SimpleBlock must not exist in a BlockGroup according to the Matroska specifications). The latter fact also meant that a MatroskaBlock's buffer was always unreferenced twice.
  This has been changed: The type of a SimpleBlock is now an EBML_BIN. The only way in which SimpleBlocks are still different is that they share their associated structure with another unit (namely BlockGroup). This is also used to unref the block: It is always unreferenced via the BlockGroup syntax.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Don't keep old blocks (Andreas Rheinhardt, 2019-06-23)
  Before this commit, the Matroska demuxer would read a block when required to do so, parse the block, create and return the necessary AVPackets and yet keep the blocks (in a dynamically allocated list), although they aren't used at all any more. This has been changed. There is no list any more and the block is immediately discarded after parsing.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Remove redundant initialization (Andreas Rheinhardt, 2019-06-23)
  Every new element of an EbmlList is zeroed initially in ebml_parse_elem, so that in particular a SimpleBlock's duration is initialized to zero. Therefore it is unnecessary to initialize this field again (for SimpleBlocks) in matroska_parse_cluster_incremental.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Set offset of first cluster (Andreas Rheinhardt, 2019-06-23)
  By default, the data_offset member of the AVFormatInternal of the AVFormatContext associated with the MatroskaDemuxContext has not been initialized explicitly by any Matroska-specific function, so that it was initialized by default to the offset at the end of matroska_read_header, i.e. usually to the offset of the length field of the first encountered cluster.
  This meant that if the Matroska-specific seek code failed because there were no index entries for the target track, a seek to data_offset would be performed and ordinary parsing would start from there, which is nonsense: The length field would be treated as an EBML ID and (if the length field is not longer than four bytes, as EBML numbers longer than that are rejected as invalid EBML IDs) whatever comes next would be treated as its EBML size although it simply isn't.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Get rid of cluster size field assumption (Andreas Rheinhardt, 2019-06-23)
  The earlier code relied on the length of clusters always being coded on eight bytes as was the behaviour of libavformat's Matroska muxer until recently. But given that our own Matroska muxer now (and mkvmerge from time immemorial) creates files that don't conform to this assumption, it is high time to get rid of this assumption.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Remove non-incremental parsing of clusters (Andreas Rheinhardt, 2019-06-22)
  When the new incremental parser was introduced, the old parser was kept, because the new parser was unable to handle the way SSA packets are put into Matroska. But since 2014 (since c7d8dbad) this is no longer needed, so that the old parser can be completely removed.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Use generic size check for signed integers (Andreas Rheinhardt, 2019-06-22)
  and drop the redundant checks contained in ebml_read_uint and ebml_read_sint.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Don't copy attached pictures (Andreas Rheinhardt, 2019-06-22)
  This commit replaces copying attached pictures by using references to the already existing buffers.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avformat/matroskadec: Compactify structure (Andreas Rheinhardt, 2019-06-07)
  Matroska EBML IDs can be at most four bytes long, so it is natural to use uint32_t for them. By doing this and rearranging the elements of the MatroskaLevel1Element structure, one can reduce the size of said structure.
  Notice that this field is not read via the generic reading process for EBML_UINT, so one is not forced to use a uint64_t for it.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
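  The size effect of such a change can be seen with two stand-in structs. The field names and the original field order are assumptions made for illustration and are not copied from MatroskaLevel1Element; exact sizes depend on the platform ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "before" layout: a 64-bit ID followed by the other fields. */
typedef struct Level1Wide {
    uint64_t id;
    uint64_t pos;
    int      parsed;
} Level1Wide;

/* Hypothetical "after" layout: the ID shrinks to 32 bits and sits next to
 * the int, so both fit into the space after the 64-bit position. */
typedef struct Level1Compact {
    uint64_t pos;
    uint32_t id;
    int      parsed;
} Level1Compact;

/* Return how many bytes per element the compact layout saves. */
static size_t level1_bytes_saved(void)
{
    assert(sizeof(Level1Compact) <= sizeof(Level1Wide));
    return sizeof(Level1Wide) - sizeof(Level1Compact);
}
```

  On a typical 64-bit ABI with a 32-bit int this comes out as 24 versus 16 bytes, but that figure is a property of the platform and not something the commit message states.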
* avformat/matroskadec: Correct outdated error message (Andreas Rheinhardt, 2019-06-07)
  This error message is outdated since d31fb1a9.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avformat/matroskadec: Remove unused variables (Andreas Rheinhardt, 2019-06-07)
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avformat/matroskadec: Improve length check (Andreas Rheinhardt via ffmpeg-devel, 2019-04-05)
  The earlier code had three flaws:
  1. The case of an unknown-sized element inside a finite-sized element (which is against the specifications) was not caught.
  2. The error message wasn't helpful: It compared the length of the child with the offset of the end of the parent and claimed that the first exceeds the latter, although that is not necessarily true.
  3. Unknown-sized elements that are not parsed can't be skipped. Given that according to the Matroska specifications only the segment and the clusters can be of unknown-size, this is handled by not allowing any other units to have infinite size whereas the earlier code would seek back by 1 byte upon encountering an infinite-size element that ought to be skipped.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@googlemail.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
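  The first two points amount to a containment check along the following lines. The function, the `LEN_*` codes and `DEMO_UNKNOWN_LENGTH` are hypothetical; the sentinel value mirrors the UINT64_MAX convention mentioned elsewhere in this log, and positions refer to the first payload byte of each element.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_UNKNOWN_LENGTH UINT64_MAX /* stand-in sentinel for "unknown length" */

enum { LEN_OK = 0, LEN_UNKNOWN_IN_KNOWN = -1, LEN_EXCEEDS_PARENT = -2 };

/* Check a child element against its containing master element. */
static int check_child_length(uint64_t child_payload_pos, uint64_t child_len,
                              uint64_t parent_payload_pos, uint64_t parent_len)
{
    uint64_t parent_end;

    /* An unknown-length parent imposes no upper bound on its children. */
    if (parent_len == DEMO_UNKNOWN_LENGTH)
        return LEN_OK;

    assert(parent_len <= UINT64_MAX - parent_payload_pos);
    parent_end = parent_payload_pos + parent_len;

    /* An unknown-length child inside a known-length parent is invalid. */
    if (child_len == DEMO_UNKNOWN_LENGTH)
        return LEN_UNKNOWN_IN_KNOWN;

    /* Compare the child's length with the space left in the parent,
     * not with the parent's absolute end offset. */
    if (child_payload_pos > parent_end ||
        child_len > parent_end - child_payload_pos)
        return LEN_EXCEEDS_PARENT;

    return LEN_OK;
}
```

  Comparing `child_len` with the remaining space (`parent_end - child_payload_pos`) is what makes a message of the form "child exceeds parent" accurate, which is the complaint in point 2.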
* lavf: Constify the probe function argument. (Carl Eugen Hoyos, 2019-03-21)
  Reviewed-by: Lauri Kasanen
  Reviewed-by: Tomas Härdin
* avformat:matroskadec: use a define to mark the EBML length is unknown (Steve Lhomme, 2019-02-24)
  Unifying the way the EBML unknown length is signaled, rather than using two incompatible values.
  UINT64_MAX cannot be read as a valid EBML length with the current code.
  Co-authored-by: Steve Lhomme <robux4@ycbcr.xyz>
  Co-authored-by: Dale Curtis <dalecurtis@chromium.org>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>