New docs area for URL semantics, relocate some HTTP pages here #35202

Merged
merged 17 commits into from
Aug 26, 2024
2 changes: 1 addition & 1 deletion files/en-us/web/http/status/421/index.md
@@ -8,7 +8,7 @@ spec-urls: https://www.rfc-editor.org/rfc/rfc9110#name-421-misdirected-request
{{HTTPSidebar}}

The HTTP **`421 Misdirected Request`** [client error response](/en-US/docs/Web/HTTP/Status#client_error_responses) status code indicates that the request was directed to a server that is not able to produce a response.
This can be sent by a server that is not configured to produce responses for the combination of [scheme](/en-US/docs/Web/URI#scheme) and [authority](/en-US/docs/Web/URI#host_name) that are included in the request URI.
This can be sent by a server that is not configured to produce responses for the combination of [scheme](/en-US/docs/Web/URI/Schemes) and [authority](/en-US/docs/Web/URI/Authority) that are included in the request URI.

Clients may retry the request over a different connection.
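
The check described above can be sketched in a few lines. This is only an illustration, not how any particular server implements it: the `SERVED_ORIGINS` set and the `status_for` helper are hypothetical names, and the origins listed are placeholder values.

```python
from http import HTTPStatus

# Hypothetical configuration: the (scheme, authority) pairs this server
# is set up to answer for. The values are placeholders.
SERVED_ORIGINS = {
    ("https", "example.com"),
    ("https", "www.example.com"),
}


def status_for(scheme: str, authority: str) -> HTTPStatus:
    """Pick a status code based on the scheme and authority of the request URI."""
    # Scheme and host names are case-insensitive, so compare in lowercase.
    if (scheme.lower(), authority.lower()) not in SERVED_ORIGINS:
        # The request reached a server that cannot produce a response for it.
        return HTTPStatus.MISDIRECTED_REQUEST
    return HTTPStatus.OK


print(status_for("https", "example.com"))
print(status_for("https", "other.example"))
```

A client that receives the 421 response may retry the same request over a different connection.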

46 changes: 46 additions & 0 deletions files/en-us/web/uri/authority/index.md
@@ -0,0 +1,46 @@
---
title: URI authority
slug: Web/URI/Authority
page-type: guide
spec-urls: https://www.rfc-editor.org/rfc/rfc3986.html#section-3.2
---

{{QuickLinksWithSubpages("/en-US/docs/Web/URI")}}

The **authority** of a URI is the part of the URI that comes after the scheme and before the path. It consists of three parts: user information, host, and port.

## Syntax

```url
host
host:port
user@host
user@host:port
```

- host
- : The _host_ is usually the domain name or IP address of the server hosting the resource. The domain name is resolved to an IP address using the {{glossary("DNS", "Domain Name System")}}.
- port
- : The _port_ is a number that indicates the port on which the server is listening for requests. It is optional and defaults to 80 for HTTP and 443 for HTTPS. Other schemes may define their own defaults or make it mandatory.
- user

- : The _user_ is optional and is used for authentication purposes. It is not commonly used in web URIs.

> [!WARNING]
> Providing user information directly in the URI is not recommended, as it can expose sensitive information. Use other methods like HTTP authentication or session cookies instead. Phishing sites sometimes trick users by displaying misleading URLs whose "user" part looks like a domain name, a technique known as a [semantic URL attack](https://en.wikipedia.org/wiki/Semantic_URL_attack).

## Examples

- `developer.mozilla.org`
- : The host is `developer.mozilla.org`. The port is not specified but will default to 443 if accessed via `https:`.
- `localhost:8080`
- : The host is `localhost` and the port is `8080`. `localhost` is a special host name that resolves to the loopback address, typically `127.0.0.1`.
- `postgres:admin123@db:5432`
- : The host is `db`, and the port is `5432`. It also specifies a user `postgres` and its password `admin123`. This can be used to connect to a PostgreSQL database.
- `cnn.example.com&story=breaking_news@10.0.0.1`
- : A misleading URL that looks like it's pointing to a trusted website. However, the host name is `10.0.0.1`, and the `cnn.example.com&story=breaking_news` part is the "user".
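
The examples above can be checked with Python's standard `urllib.parse` module. The following is a minimal sketch: the `describe_authority` helper and the `DEFAULT_PORTS` table are hypothetical names, and the schemes prefixed to each authority (including `postgresql:`) are assumptions added so that the strings parse as full URLs.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes named on this page; other schemes
# define their own defaults.
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe_authority(url: str) -> dict:
    """Split a URL's authority into user, password, host, and effective port."""
    parts = urlsplit(url)
    return {
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        # Fall back to the scheme's default when the URL does not name a port.
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
    }


for url in (
    "https://developer.mozilla.org/",
    "http://localhost:8080/",
    "postgresql://postgres:admin123@db:5432/",
    # Everything before the "@" is user information, not the host.
    "https://cnn.example.com&story=breaking_news@10.0.0.1/",
):
    print(url, describe_authority(url))
```

For the last URL, the reported host is `10.0.0.1`, which shows why the text before the `@` should never be read as the destination.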

## See also

- [URIs](/en-US/docs/Web/URI)
- [Choosing between www and non-www URLs](/en-US/docs/Web/URI/Authority/Choosing_between_www_and_non-www_URLs)
38 changes: 38 additions & 0 deletions files/en-us/web/uri/fragment/index.md
@@ -0,0 +1,38 @@
---
title: URI fragment
slug: Web/URI/Fragment
page-type: guide
spec-urls: https://www.rfc-editor.org/rfc/rfc3986.html#section-3.5
---

{{QuickLinksWithSubpages("/en-US/docs/Web/URI")}}

The **fragment** of a URI is the last part of the URI, starting with the `#` character. It is used to identify a specific part of the resource, such as a section of a document or a position in a video. The fragment is not sent to the server when the URI is requested, but it is processed by the client (such as the browser) after the resource is retrieved.

## Syntax

```url
#fragment
```

- fragment

- : A sequence of any characters. The exact format of the fragment is defined by the resource itself. Some common examples:

- In an HTML document, it can be the [`id`](/en-US/docs/Web/HTML/Global_attributes/id) attribute of an element, and the browser will scroll to that element.
- It can be a [text fragment](/en-US/docs/Web/URI/Fragment/Text_fragments) in the form of `#:~:text=...`, which makes the browser highlight the specified text.
- It can be a [media fragment](https://www.w3.org/TR/media-frags/) in the form of `#t=...`, which makes the video or audio start playing from that time.

## Examples

- `#syntax`
- : The browser will scroll to the element with the `id="syntax"` in the document (which, for this page, is the [Syntax](#syntax) heading).
- `#:~:text=fragment`
- : The browser will highlight the text `fragment` in the document.
- `#t=10,20`
- : The video or audio will play from the 10th second to the 20th second.
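
To illustrate that the fragment stays on the client, here is a small sketch using Python's `urllib.parse`. The `split_for_request` helper and the `example.com` URLs are hypothetical; a real HTTP client performs an equivalent split internally before sending the request.

```python
from urllib.parse import urldefrag, urlsplit


def split_for_request(url: str) -> tuple[str, str]:
    """Return (request target sent to the server, fragment kept by the client)."""
    without_fragment, fragment = urldefrag(url)
    parts = urlsplit(without_fragment)
    # An empty path is sent as "/".
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target, fragment


for url in (
    "https://example.com/docs?lang=en#syntax",
    "https://example.com/page#:~:text=fragment",
    "https://example.com/video.mp4#t=10,20",
):
    target, fragment = split_for_request(url)
    print(f"server sees {target!r}, client keeps {fragment!r}")
```

In each case the server only receives the path and query; interpreting the fragment (scrolling, highlighting, or seeking) is left to the client.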

## See also

- [URIs](/en-US/docs/Web/URI)
- [Text fragments](/en-US/docs/Web/URI/Fragment/Text_fragments)
44 changes: 27 additions & 17 deletions files/en-us/web/uri/index.md
@@ -57,33 +57,43 @@ http://www.example.com:80/path/to/myfile.html?key1=value1&key2=value2#SomewhereI

`http://` is the [_scheme_](/en-US/docs/Web/URI/Schemes) of the URL, indicating which protocol the browser must use. Usually it is the HTTP protocol or its secured version, HTTPS. The Web requires one of these two, but browsers also know how to handle other protocols such as `mailto:` (to open a mail client) or `ftp:` to handle a file transfer, so don't be surprised if you see such protocols. Common schemes are:

| Scheme | Description |
| -------------------------------- | -------------------------------------------------------------------- |
| data | [Data URLs](/en-US/docs/Web/URI/Schemes/data) |
| file | Host-specific file names |
| ftp | {{Glossary("FTP","File Transfer Protocol")}} |
| http/https | [Hyper text transfer protocol (Secure)](/en-US/docs/Glossary/HTTP) |
| javascript | URL-embedded JavaScript code |
| mailto | Electronic mail address |
| resource {{Non-standard_inline}} | Firefox and Firefox browser extensions to load resources internally |
| ssh | Secure shell |
| tel | telephone |
| urn | Uniform Resource Names |
| view-source | Source code of the resource |
| ws/wss | [WebSocket connections (Secure)](/en-US/docs/Web/API/WebSockets_API) |
- [`blob`](/en-US/docs/Web/API/URL/createObjectURL_static)
- : Binary Large Object; a pointer to a large in-memory object
- [`data`](/en-US/docs/Web/URI/Schemes/data)
- : Data directly embedded in the URL
- `file`
- : Host-specific file names
- `ftp`
- : {{Glossary("FTP","File Transfer Protocol")}}
- `http/https`
- : [Hyper text transfer protocol (Secure)](/en-US/docs/Glossary/HTTP)
- `javascript`
- : URL-embedded JavaScript code
- `mailto`
- : Electronic mail address
- [`resource`](/en-US/docs/Web/URI/Schemes/resource) {{Non-standard_inline}}
- : Firefox and Firefox browser extensions to load resources internally
- `ssh`
- : Secure shell
- `tel`
- : telephone
- `urn`
- : Uniform Resource Names
- `view-source`
- : Source code of the resource
- `ws/wss`
- : [WebSocket connections (Secure)](/en-US/docs/Web/API/WebSockets_API)

When using URLs in {{Glossary("HTML")}} content, you should generally only use a few of these URL schemes. When referring to subresources — that is, files that are being loaded as part of a larger document — you should only use the HTTP and HTTPS schemes. Increasingly, browsers are removing support for using FTP to load subresources, for security reasons.

FTP is still acceptable at the top level (such as typed directly into the browser's URL bar, or the target of a link), although some browsers may delegate loading FTP content to another application.

### Host name
### Authority

![Domain Name]([email protected])

`www.example.com` is the _host name_, or domain name, indicating which Web server is being requested. Alternatively, it is possible to directly use an {{Glossary("IP address")}}, but because it is less convenient, it is not often used on the Web.

### Port

![Port]([email protected])

`:80` is the _port_ of the URL, indicating the technical "gate" used to access the resources on the web server. It is usually omitted if the web server uses the standard ports of the HTTP protocol (80 for HTTP and 443 for HTTPS) to grant access to its resources. Otherwise, it is mandatory.
51 changes: 51 additions & 0 deletions files/en-us/web/uri/schemes/index.md
@@ -0,0 +1,51 @@
---
title: URI schemes
slug: Web/URI/Schemes
page-type: guide
spec-urls: https://www.rfc-editor.org/rfc/rfc3986.html#section-3.1
---

{{QuickLinksWithSubpages("/en-US/docs/Web/URI")}}

The **scheme** of a URI is the first part of the URI, before the `:` character. It indicates which protocol the browser must use to fetch the resource. The scheme may affect how the rest of the URI is structured and interpreted.

## Syntax

```url
protocol:
```

- protocol
- : A sequence of characters that identifies the protocol to use. It must start with a letter and may otherwise contain only alphanumeric characters and the `+`, `-`, and `.` characters. Common schemes are:
- [`blob`](/en-US/docs/Web/API/URL/createObjectURL_static)
- : Binary Large Object; a pointer to a large in-memory object
- [`data`](/en-US/docs/Web/URI/Schemes/data)
- : Data directly embedded in the URL
- `file`
- : Host-specific file names
- `ftp`
- : {{Glossary("FTP","File Transfer Protocol")}}
- `http/https`
- : [Hyper text transfer protocol (Secure)](/en-US/docs/Glossary/HTTP)
- `javascript`
- : URL-embedded JavaScript code
- `mailto`
- : Electronic mail address
- [`resource`](/en-US/docs/Web/URI/Schemes/resource) {{Non-standard_inline}}
- : Firefox and Firefox browser extensions to load resources internally
- `ssh`
- : Secure shell
- `tel`
- : telephone
- `urn`
- : Uniform Resource Names
- `view-source`
- : Source code of the resource
- `ws/wss`
- : [WebSocket connections (Secure)](/en-US/docs/Web/API/WebSockets_API)
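
As a rough illustration of the syntax rule, the sketch below extracts the scheme with Python's `urllib.parse` and checks a candidate name against the grammar in RFC 3986, section 3.1 (a letter followed by letters, digits, `+`, `-`, or `.`). The `scheme_of` and `is_valid_scheme` helpers and the sample URIs are hypothetical.

```python
import re
from urllib.parse import urlsplit

# RFC 3986, section 3.1: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def is_valid_scheme(scheme: str) -> bool:
    """Check a candidate scheme name against the RFC 3986 grammar."""
    return SCHEME_RE.fullmatch(scheme) is not None


def scheme_of(uri: str) -> str:
    """Return the scheme of a URI, lowercased, or "" if none is present."""
    return urlsplit(uri).scheme


for uri in (
    "https://developer.mozilla.org/",
    "mailto:someone@example.com",
    "data:text/plain,hello",
    "ws://example.com/socket",
):
    print(uri, "->", scheme_of(uri))
```

Schemes are case-insensitive, which is why `urlsplit` reports them in lowercase.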

## See also

- [URIs](/en-US/docs/Web/URI)
- [Data URLs](/en-US/docs/Web/URI/Schemes/data)
- [Resource URLs](/en-US/docs/Web/URI/Schemes/resource)