Servo
Not sure if this is helpful for others. I learned a little about the various collaborators involved in Servo, which provides useful context.
Gecko
The Gecko engine is written primarily in C++ and makes heavy use of the observer pattern and interface-like superclasses. There also appear to be two code paths, one of them deprecated, depending on whether older HTML or HTML5 is being parsed. Gecko is also at the core of Firefox, a full-featured browser with all of the functionality one takes for granted. Taken together, this makes the code harder to follow than Servo's for someone new to it. What follows is very tentative.
Loading a page from a URL
Let's start in the vicinity of the docshell class, which is an important coordinating class in the Gecko engine. (The code involved here is gnarly.)
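To make the observer pattern and interface-like base classes mentioned above a bit more concrete, here is a minimal sketch in plain C++. None of it is Gecko code: `LoadObserver`, `PageLoader`, `RecordingObserver` and `DemoLoad` are hypothetical names I made up, and the real classes carry far more state and many more callbacks. It only shows the general shape: a pure-virtual base class acting as an interface, and a coordinating object that notifies whoever has registered.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical interface-like base class: nothing but pure virtual methods.
class LoadObserver {
public:
    virtual ~LoadObserver() = default;
    virtual void OnStateChange(const std::string& url,
                               const std::string& state) = 0;
};

// Hypothetical coordinating class that notifies observers as a load progresses.
class PageLoader {
public:
    void AddObserver(LoadObserver* observer) { observers_.push_back(observer); }

    void Load(const std::string& url) {
        Notify(url, "start");
        // Fetching and parsing would happen here in a real engine.
        Notify(url, "stop");
    }

private:
    void Notify(const std::string& url, const std::string& state) {
        for (LoadObserver* observer : observers_) {
            observer->OnStateChange(url, state);
        }
    }

    std::vector<LoadObserver*> observers_;  // non-owning pointers
};

// One concrete observer that simply records the notifications it receives.
class RecordingObserver : public LoadObserver {
public:
    void OnStateChange(const std::string& url,
                       const std::string& state) override {
        events.push_back(state + " " + url);
    }

    std::vector<std::string> events;
};

// Wires an observer to a loader, runs one load, and returns what was observed.
std::vector<std::string> DemoLoad(const std::string& url) {
    RecordingObserver recorder;
    PageLoader loader;
    loader.AddObserver(&recorder);
    loader.Load(url);
    assert(recorder.events.size() == 2);
    return recorder.events;
}
```

The difficulty when reading code in this style is that `PageLoader` never names the concrete classes it calls into, so following a page load means finding every place an observer was registered.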
Chromium
Chromium is a huge code base. It took me hours to download the project, and then forever to compile it (just for fun). Somehow it is easier to navigate than Gecko. To someone new to C++ like myself, Gecko gives the vague impression of being more abstract and hard to reason about in some ways. Much of the Chromium code base deals with the user interface. Chromium delegates its HTML handling to Blink, a rendering engine. It is in Blink that tokenization, parsing, DOM tree building and related things happen. I downloaded and compiled Chromium by following these instructions, which were mostly straightforward. Along the way I noticed that there are thousands of interesting Markdown docs on relevant topics that I might want to skim at some point (HTML here).
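As a rough mental model of the tokenization, parsing and DOM tree building that Blink is responsible for, here is a toy sketch in C++. It is not Blink code: `Token`, `Node`, `Tokenize`, `BuildTree`, `Dump` and `ParseAndDump` are all hypothetical names, and it only understands bare start tags, end tags and text (no attributes, comments, entities, scripts or error recovery). The one thing it tries to illustrate is a tree builder that consumes tokens and appends nodes to the document tree directly, with no separate intermediate tree, which is how the HTML Standard describes the tree construction stage.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

enum class TokenKind { StartTag, EndTag, Text };

struct Token {
    TokenKind kind;
    std::string data;  // tag name, or the text itself
};

// Toy tokenizer: recognizes only <name>, </name> and runs of text.
std::vector<Token> Tokenize(const std::string& input) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < input.size()) {
        if (input[i] == '<') {
            std::size_t close = input.find('>', i);
            if (close == std::string::npos) break;  // unterminated tag: stop
            bool is_end = (i + 1 < close && input[i + 1] == '/');
            std::size_t name_start = i + (is_end ? 2 : 1);
            tokens.push_back({is_end ? TokenKind::EndTag : TokenKind::StartTag,
                              input.substr(name_start, close - name_start)});
            i = close + 1;
        } else {
            std::size_t next = input.find('<', i);
            if (next == std::string::npos) next = input.size();
            tokens.push_back({TokenKind::Text, input.substr(i, next - i)});
            i = next;
        }
    }
    return tokens;
}

struct Node {
    std::string name;  // element name, "#text" or "#document"
    std::string text;  // used only by text nodes
    Node* parent = nullptr;
    std::vector<std::unique_ptr<Node>> children;
};

// Toy tree builder: each token mutates the tree in place as it arrives.
std::unique_ptr<Node> BuildTree(const std::vector<Token>& tokens) {
    auto document = std::make_unique<Node>();
    document->name = "#document";
    Node* current = document.get();  // innermost open element
    for (const Token& token : tokens) {
        if (token.kind == TokenKind::EndTag) {
            // Close the current element on a matching name; ignore stray end tags.
            if (current->parent != nullptr && current->name == token.data) {
                current = current->parent;
            }
            continue;
        }
        auto node = std::make_unique<Node>();
        node->parent = current;
        Node* raw = node.get();
        if (token.kind == TokenKind::StartTag) {
            node->name = token.data;
        } else {
            node->name = "#text";
            node->text = token.data;
        }
        current->children.push_back(std::move(node));
        if (token.kind == TokenKind::StartTag) current = raw;  // descend
    }
    return document;
}

// Serializes the tree into a compact string so its shape can be inspected.
std::string Dump(const Node& node) {
    if (node.name == "#text") return "\"" + node.text + "\"";
    std::string out = node.name + "(";
    for (std::size_t i = 0; i < node.children.size(); ++i) {
        if (i > 0) out += ",";
        out += Dump(*node.children[i]);
    }
    return out + ")";
}

std::string ParseAndDump(const std::string& html) {
    std::unique_ptr<Node> document = BuildTree(Tokenize(html));
    assert(document->name == "#document");
    return Dump(*document);
}
```

Calling `ParseAndDump("<p>Hi <b>there</b></p>")` returns `#document(p("Hi ",b("there")))`. A real engine feeds tokens to the tree builder incrementally rather than collecting them in a vector first; the vector here just keeps the sketch short.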
WebKit
WebKit is the core of Safari and, in some form, Chrome. Its origins are in the KHTML and KJS libraries from KDE, a Linux desktop environment. It is the starting point for Chrome's Blink rendering engine, which is a fork of WebKit's WebCore component. I recall hearing that the Chrome team went with this approach because WebKit was very fast. With some effort I downloaded the sources for WebKit and compiled them.
$ gh repo clone WebKit/WebKit
It was a little fiddly to compile the project. I ended up installing a bunch of development libraries, discovering some that would be hard to track down, and then running these commands:
$ cmake -DPORT=GTK -DUSE_JPEGXL=OFF -DUSE_OPENJPEG=OFF -DCMAKE_BUILD_TYPE=RelWithDebInfo -GNinja -DUSE_WOFF2=OFF -DUSE_LCMS=OFF -DUSE_LIBBACKTRACE=OFF
$ ninja
It took several hours to compile the sources to a binary. To run the binary, I had to tell
In the case of Servo, Gecko, Chromium, and other open source browsers, how do the tokenizer, parser and DOM interact? Is there an intermediate representation, in which an IR tree is first built up and then converted into DOM nodes? Is the DOM passed around and mutated in place without the benefit of an IR? At what point does script execution take over, and what does that handoff look like? What are the different collaborating classes and structs that are involved to make all of this happen?