diff --git a/specs/as5/aegisub.png b/specs/as5/aegisub.png new file mode 100644 index 000000000..ba45df0d6 Binary files /dev/null and b/specs/as5/aegisub.png differ diff --git a/specs/as5/as5.pdf b/specs/as5/as5.pdf index 93fcb317d..92c634487 100644 Binary files a/specs/as5/as5.pdf and b/specs/as5/as5.pdf differ diff --git a/specs/as5/as5.tex b/specs/as5/as5.tex index 0a553b4a4..c53d3b9f1 100644 --- a/specs/as5/as5.tex +++ b/specs/as5/as5.tex @@ -1,4 +1,5 @@ \documentclass{spec} +\usepackage[pdftex]{graphicx} \newcommand{\syntax}[1]{ \subsubsection*{Syntax} @@ -14,14 +15,45 @@ } \newcommand{\secspec}[1]{Section:\>\texttt{#1}} \newcommand{\secspecs}[2]{Sections:\>\texttt{#1}, \texttt{#2}} +\newcommand{\HRule}{\rule{\linewidth}{0.5mm}} -\title{\LaTeX} -\date{} \begin{document} \title{AS5 Subtitle Format Draft} \author{Rodrigo Braz Monteiro, Niels Martin Hansen, David Lamparter} -\spectitle + +\begin{titlepage} +\begin{center} + +\vspace*{3cm} + +\HRule \\[0.5cm] +\textsc{\huge AS5 Subtitle Format}\\ +\HRule \\[1.1cm] +{\large By Rodrigo Braz Monteiro, Niels Martin Hansen and David Lamparter}\\[0.3cm] +This work is licensed under a Creative Commons Attribution-Share Alike 3.0 License.\\ +\vfill + +\begin{minipage}{0.4\textwidth} +\begin{flushleft} \large +\includegraphics[width=0.7\textwidth]{./aegisub} +\end{flushleft} +\end{minipage} +\begin{minipage}{0.4\textwidth} +\begin{flushright} \large +\includegraphics[width=0.6\textwidth]{./asa} +\end{flushright} +\end{minipage}\\[1.5cm] + +{\large \today} + +\end{center} +\end{titlepage} + +\setlength{\parskip}{0pt} +\tableofcontents +\newpage +\setlength{\parskip}{8pt} \section{Abstract} @@ -45,7 +77,7 @@ improvement over SSA4 format (from which ASS, ASS2 and ASS3 derive). The full name of the format is "`AS5 Subtitle Format"'. 
-\section{File Structure} +\section{AS5 Files} \subsection{File Format} All AS5 files are \emph{REQUIRED} to comply with the three requirements below: @@ -61,6 +93,7 @@ That is, it must be a plain-text file. The character set of a subtitle file can be autodetermined by its Byte-Order Mark or by the value of the first two bytes. See below. + \subsection{File Structure} The file is divided in \emph{sections}, which are uniquely identified by a string inside square brackets, in a line of its own. From that point on, every next line is considered @@ -69,12 +102,12 @@ termination mark; they always end at the start of the next one or at the end of \emph{Section names are case sensitive.} Each section is divided in lines, each line representing one command or definition. Empty -lines \emph{MUST} be ignored. It is recommended that programs generating AS5 files insert -a blank line at the end of each section to increase readability. There \emph{MUST} always +lines \must\ be ignored. It is recommended that programs generating AS5 files insert +a blank line at the end of each section to increase readability. There \must\ always be a blank line at the end of the file (as every line is required to end in a line break). Each line in a section takes the general form of \textit{Type: data1,data2,...,dataN}. An -unknown \textit{Type} \emph{MUST} be ignored by a parser. It is recommended that subtitle +unknown \textit{Type} \must\ be ignored by a parser. It is recommended that subtitle editing programs keep such ignored lines in the file after re-saving it. Note that the space after the colon is \emph{mandatory}. @@ -82,18 +115,19 @@ There are two sections which are required, \emph{[AS5]} and \emph{[Events]}, the the equivalent of \emph{[Script Info]} in previous formats. If either of those sections is missing, the file is deemed invalid and \emph(MUST) be refused by the parser. Any other section can be ommitted from the file, and need not be implemented by all parsers. 
However, any unknown -section \emph{MUST} be preserved in the file by a subtitle editing program when it re-saves a +section \must\ be preserved in the file by a subtitle editing program when it re-saves a file with sections that it does not recognize. It can, however, be removed at the user's discretion. Finally, there is a special type of undefined group, \emph{[Private:PROGNAME]}, which -\emph{MUST} be \emph{ENTIRELY} preserved by other programs when re-saving it. This is used to +\must\ be \emph{ENTIRELY} preserved by other programs when re-saving it. This is used to store program-specific data, for example, Aegisub would create a group called \emph{[Private:Aegisub]} to store its data inside. This type of group should be identified by the fact that it starts with \emph{"`[Private:"'}. + \subsubsection{[AS5]} This must be the first section in every AS5 file. If the very first line of the file is not -[AS5], the file \emph{MUST} be rejected by the parser as invalid. Note, however, that the first +[AS5], the file \must\ be rejected by the parser as invalid. Note, however, that the first line is allowed to contain a Byte-Order Mark (BOM), which is the character U+FEFF encoded in the encoding used for the rest of the script\cite{Unicode BOM}. The first four bytes will therefore be: @@ -110,14 +144,14 @@ It is possible, therefore, to determine the encoding of the file by checking its This section is used to declare several script properties that affect its parsing and rendering. All properties are stored in the format \textit{Name: data}, with one property per line. -This section \emph{MUST} always declare the following properties: +This section \must\ always declare the following properties: \begin{itemize} \item ScriptType: Should always be set to \textit{AS5}, for this particular version of the specification. -If this contains a value that the parser does not understand, it \emph{MUST} abort parsing. 
+If this contains a value that the parser does not understand, it \must\ abort parsing. \item Resolution: Should contain the script resolution in \textit{WxH} format. For example, for a 640x480 script, this should say \textit{"`Resolution: 640x480"'}. Note that this does not need to correspond to the -video resolution, however, subtitles \emph{MUST} be rendered on such a coordinate space. That is, in a +video resolution, however, subtitles \must\ be rendered on such a coordinate space. That is, in a 640x480 script, \textbackslash{pos(320,240)} always represents the center of the script, no matter the resolution of the video it's being drawn on. Also, in a 100x100 script, a radius 50 circle centered on the center will always take half of the height and half of the width of the video, even if that means @@ -134,6 +168,10 @@ interaction. break lines or "`Automatic"', in which the renderer chooses how to break them. The default is "`Automatic"'. Note that if this is set to manual, the line can NEVER be broken at anywhere other than forced line breaks, even if it means that the line will become unreadable because it goes outside the display area. +\item Extensions: A comma-separated list of all extensions being used in this file. At the moment, there are +no extensions available. Renderers should read this to enable any extensions that they might support. +Editing programs \must\ keep this field intact, unless the user chooses otherwise. Scripts WILL break +if the list of extensions is suddenly lost. \item Credits: Credits for the people who worked on this subtitle file. Should be ignored by the renderer. \item Title: The title of this script. Should be ignored by the renderer. Subtitling programs may opt to display this title to the user. @@ -143,6 +181,7 @@ Although any other lines are allowed in this group, this is not encouraged, as t with future revisions of the format. Instead, they should be stored in \textit{[Private:PROGNAME]} groups, as mentioned above. 
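The Resolution property described above can be illustrated with a minimal sketch. It assumes that each axis is scaled independently from script space to video space, which is one reading of the coordinate-space rule. The function names `parse_resolution` and `script_to_video` are hypothetical and not part of the specification.

```python
def parse_resolution(value):
    """Parse a 'WxH' Resolution value such as '640x480' into (width, height)."""
    width, sep, height = value.strip().partition("x")
    if not sep:
        raise ValueError("Resolution must be in WxH format: %r" % value)
    w, h = int(width), int(height)
    if w <= 0 or h <= 0:
        raise ValueError("Resolution must be positive: %r" % value)
    return w, h


def script_to_video(x, y, script_res, video_res):
    """Map a point from script coordinates to video pixel coordinates.

    Assumption: each axis is scaled on its own, so the script's
    coordinate space always covers the whole video frame.
    """
    script_w, script_h = script_res
    video_w, video_h = video_res
    return x * video_w / script_w, y * video_h / script_h


# The centre of a 640x480 script stays the centre of a 1920x1080 video.
script_res = parse_resolution("640x480")
centre = script_to_video(320, 240, script_res, (1920, 1080))
```

With this mapping, `\pos(320,240)` in a 640x480 script lands on the middle of the frame whatever the video resolution is.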
+ \subsubsection{[Events]} The most important section, [Events], lists all the actual subtitle lines in the file. Each line is @@ -160,7 +199,7 @@ never be displayed. If the end time is lesser than the start time, the renderer but should render the remaining lines regardless of the issue. \item Style: The name of the default style used for this line. See the [Style] section below. Should be left blank if you want to use the the script's global default style. If an unknown style is specified, -the renderer \emph{MUST} fallback to default, and might issue a warning. +the renderer \must\ fall back to the default, and might issue a warning. \item User: This field is used by the program to store program-specific data in each line. Renderers should ignore this. This should be left blank if it's not used. \item Content: The actual text of the line. This contains actual text and override tags. See the section @@ -168,21 +207,81 @@ on override tags for more information. \end{itemize} The timestamp format is h...h:mm:ss[.s...], that is, it begins with an integer of arbitrary length -(up to a maximum of 4 digits) representing the number of hours, followed by two integers representing -minutes, and a floating point number representing seconds. Localization is irrelevant: a period ("`."') -is always used to separate the decimal point. This way, 0:21:42.5 and 0000:21:42.5000 are equivalent, -and both represent 0 hours, 21 minutes, 42 seconds and 500 miliseconds. +(up to a maximum of 4 digits) representing the number of hours, followed by a one-digit or two-digit integer +representing minutes, and a floating point number representing seconds. Localization is irrelevant: a +period ("`."') is always used as the decimal separator. This way, 0:21:42.5 and 0000:21:42.5000 are +equivalent, and both represent 0 hours, 21 minutes, 42 seconds and 500 milliseconds. -Spaces between each field \emph{MUST} be ignored by all parsers. 
Any spaces at the beginning of the +Spaces between each field \must\ be ignored by all parsers. Any spaces at the beginning of the content line should be stripped. A hard space or empty override block should be used if space at the start of a line is truly desirable. That is, the two following lines are identical: \begin{verbatim} -Line: 0:12:31.57 , 0:12:34.22 , , , Hello world of {\b1}AS5{\b0}! -Line: 0:12:31.57,0:12:34.22,,,Hello world of {\b1}AS5{\b0}! +Line: 0:2:31.57 , 0:02:34.22 , , , Hello world of {\b1}AS5{\b0}! +Line: 0:02:31.570,00:02:34.22,,,Hello world of {\b1}AS5{\b0}! \end{verbatim} +\subsubsection{[Styles]} + +This is equivalent to the \emph{[V4 Styles]} (and subsequent variations) from the Sub Station Alpha format. +Each entry is declared as "`Style: name,parent,overrides"'. Like \emph{[Events]}, it has been greatly +simplified when compared to the previous formats, and now contains only three fields. They are: + +\begin{itemize} +\item Name: The name of this style. Style names are not case-sensitive, but \must\ be unique. A +script with conflicting style names must be refused by the parser. If the style name is "`Default"', it +will be used for all lines that omit the style name. If there is no "`Default"' line, the renderer +default is used. +\item Parent: The style from which the current style derives. See below for more information. +Leaving this field blank means that the style derives from the renderer's default style. +\item Overrides: A list of override tags to define this style. See below. +\end{itemize} + +Styles work in a very different way from the way they did in previous formats (with the notable exception +of ASS3, which actually implements this very same style based on this format, as "`StyleEx"'). +Instead of setting multiple parameters across many commas, you simply specify override tags. When a line +uses a style, it's as if the overrides of the style were inserted right before the start of the line +contents. 
+ +Also, a style can inherit from another style, and define new overrides which are then appended to those +of the parent style. The parent style \must\ have been declared \emph{BEFORE} the style trying to use +it as a parent. If the parent doesn't exist or wasn't declared yet, the parser must refuse to parse the +script. This is important because otherwise you could get an "`inheritance loop"', where styles derive from +each other in a cycle. + +For example, see the following \emph{[Styles]} group: + +\begin{verbatim} +[Styles] +Style: Default,,\fn(Arial)\fs20 +Style: Speech,,\fn(Respublica)\fs24\bord2\shad2\4a#80\2c#000000 +Style: Actor1,Speech,\1c#B9C5E3 +Style: Actor2,Speech,\1c#FFB3CF +Style: UglinessItself,Default,\fn(Comic Sans MS) +\end{verbatim} + +In the above fragment, the first style defines the Default style that will be used on all lines that +don't set any style and the second style defines a base speech style that will be used for all actors +(note that it doesn't inherit from Default: even though Default overrides the renderer's default for lines, +the renderer's default is still the base for style definitions that leave the parent blank). + +The third and fourth styles are based on the second, and simply assign different colours to it. They +will both have all properties of Speech, and only differ in primary colour. Finally, the last example +shows how to derive from the overridden default. In this case, font size would be 20 points, regardless +of the renderer's default. + +The two Actor styles could have been defined without a parent style as follows: + +\begin{verbatim} +[Styles] +Style: Actor1,,\fn(Respublica)\fs24\bord2\shad2\4a#80\2c#000000\1c#B9C5E3 +Style: Actor2,,\fn(Respublica)\fs24\bord2\shad2\4a#80\2c#000000\1c#FFB3CF +\end{verbatim} + +Since all that deriving a style from another does is append the new tags to the end of the parent's, +this way of declaring styles is equivalent to the one above, but is more verbose. 
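The inheritance rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `resolve_styles` is hypothetical, surrounding spaces in each field are assumed to be ignored as they are for [Events], and a blank parent is taken to contribute no tags because the renderer's default is applied elsewhere.

```python
def resolve_styles(style_lines):
    """Resolve 'Style: name,parent,overrides' lines into full override strings.

    Returns a dict keyed by lower-cased style name (names are not
    case-sensitive). Raises ValueError for duplicate names and for parents
    that are missing or declared later, which also rules out cycles.
    """
    resolved = {}
    for line in style_lines:
        kind, sep, data = line.partition(": ")
        if not sep or kind != "Style":
            continue  # unknown line types are ignored
        fields = data.split(",", 2)  # overrides may themselves contain commas
        if len(fields) != 3:
            raise ValueError("malformed style line: %r" % line)
        name, parent, overrides = (field.strip() for field in fields)
        key = name.lower()
        if key in resolved:
            raise ValueError("duplicate style name: %r" % name)
        if parent:
            parent_key = parent.lower()
            if parent_key not in resolved:
                raise ValueError("parent not declared before use: %r" % parent)
            # Deriving a style appends the new tags after the parent's tags.
            overrides = resolved[parent_key] + overrides
        resolved[key] = overrides
    return resolved


example = [
    r"Style: Default,,\fn(Arial)\fs20",
    r"Style: Speech,,\fn(Respublica)\fs24\bord2\shad2\4a#80\2c#000000",
    r"Style: Actor1,Speech,\1c#B9C5E3",
    r"Style: UglinessItself,Default,\fn(Comic Sans MS)",
]
styles = resolve_styles(example)
```

Because parents must already be in the table when a child is read, a single forward pass is enough and no separate cycle detection is needed.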
+ \addcontentsline{toc}{section}{References} \begin{thebibliography}{1} diff --git a/specs/as5/asa.png b/specs/as5/asa.png new file mode 100644 index 000000000..fb4433fda Binary files /dev/null and b/specs/as5/asa.png differ