Web APIs W-W Page
I created this page because, when implementing Notifications and the Service Worker for this website, I came across the MDN Web Docs page on Web APIs and noticed how many interesting-looking Web APIs I hadn't tried out yet. I am going to use this page to learn more about the Web APIs and implement them to test their functionality. Because there are 137 Web APIs available, I am going to paginate this project so that one page does not have too much information.
The list below shows the Web APIs that are available on various platforms, and it provides links to notes that I have taken on them / implementations of the various Web APIs on this site.
- A-B
- C-C
- D-F
- G-I
- K-P
- Launch Handler API
- Local Font Access API
- Media Capabilities API
- Media Capture and Streams
- Media Session API
- Media Source Extensions
- MediaStream Recording
- Navigation API
- Network Information API
- Page Visibility API
- Payment Handler API
- Payment Request API
- Performance API
- Periodic Background Sync
- Permissions API
- Picture-in-Picture API
- Pointer Events
- Pointer Lock API
- Popover API
- Presentation API
- Prioritized Task Scheduling API
- Push API
- R-S
- T-V
- W-W
- Web Audio API
- Web Authentication API
- Web Components
- Web Crypto API
- Web Locks API
- Web MIDI API
- Web NFC API
- Web Notifications
- Web Serial API
- Web Share API
- Web Speech API
- Web Storage API
- Web Workers API
- WebCodecs API
- WebGL
- WebGPU API
- WebHID API
- WebOTP API
- WebRTC
- WebSockets API
- WebTransport API
- WebUSB API
- WebVR API
- WebVTT
- WebXR Device API
- Window Controls Overlay API
- Window Management API
- Web Audio API Reference
- Introductory Web Audio Tutorial
- Basic Concepts Behind the Web Audio API
- Advanced Tutorial
The Web Audio API provides a powerful and versatile system for controlling audio on the Web, allowing developers to choose audio sources, add effects to audio, create audio visualizations, apply spatial effects (such as panning) and much more.
The Web Audio API involves handling audio operations inside an audio context, and has been designed to allow modular routing. Basic audio operations are performed with audio nodes, which are linked together to form an audio routing graph. Several sources - with different types of channel layout - are supported even within a single context. This modular design provides the flexibility to create complex audio functions with dynamic effects.
An audio routing graph typically starts with one or more sources. Sources provide arrays of sound intensities (samples) at very small timescales, often tens of thousands of them per second. These could be computed mathematically via OscillatorNode, or they can be recordings from sound/video files (like AudioBufferSourceNode and MediaElementAudioSourceNode) and audio streams (MediaStreamAudioSourceNode). In fact, sound files are just recordings of sound intensities themselves, which come in from microphones or electric instruments and get mixed down into a single, complicated wave.
Outputs of these nodes could be linked to inputs of others, which mix or modify these streams of sound samples into different streams. A common modification is multiplying the samples by a value to make them louder or quieter (as is the case with GainNode). Once the sound has been sufficiently processed for the intended effect, it can be linked to the input of a destination (BaseAudioContext.destination), which sends the sound to the speakers or headphones. A typical workflow for web audio would look something like:
- Create an audio context.
- Inside the context, create sources - such as <audio>, an oscillator, or a stream.
- Create effects nodes, such as reverb, biquad filter, panner, or compressor.
- Choose the final destination of the audio, for example your system speakers.
- Connect the sources up to the effects, and the effects to the destination.
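A minimal sketch of this workflow, with an oscillator source routed through a gain node to the speakers (the frequency and gain values are arbitrary, and browsers typically require a user gesture before an AudioContext will produce sound):

```js
const audioCtx = new AudioContext();

// Source: a mathematically-computed sine wave.
const oscillator = audioCtx.createOscillator();
oscillator.type = "sine";
oscillator.frequency.value = 440; // A4

// Effect: halve the volume by multiplying samples by 0.5.
const gainNode = audioCtx.createGain();
gainNode.gain.value = 0.5;

// Connect source -> effect -> destination (speakers/headphones).
oscillator.connect(gainNode);
gainNode.connect(audioCtx.destination);

oscillator.start();
oscillator.stop(audioCtx.currentTime + 2); // play for two seconds
```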
Timing is controlled with high precision and low latency, allowing developers to write code that responds accurately to events and is able to target specific samples, even at a high sample rate. So applications such as drum machines and sequencers are well within reach. The Web Audio API also allows us to control how audio is spatialized. Using a system based on a source-listener model, it allows control of the panning model and handles the distance-induced attenuation caused by a moving source (or moving listener).
The Web Authentication API (WebAuthn) is an extension of the Credential Management API that enables strong authentication with public key cryptography, enabling passwordless authentication and secure multi-factor authentication (MFA) without SMS texts.
WebAuthn uses asymmetric (public-key) cryptography instead of passwords or SMS texts for registration, authentication, and multi-factor authentication with websites. Some benefits:
- Protection against phishing: An attacker who creates a fake login website can't log in as the user because the signature changes with the origin of the website.
- Reduced impact of data breaches: Servers store only the public key, and if an attacker gets access to the public key used to verify the authentication, they still can't authenticate because they don't have the private key.
- Invulnerable to password attacks: Some users might reuse passwords, and an attacker may obtain the user's password for another website; because WebAuthn replaces passwords with key pairs, there is no password to steal or reuse.
Many websites have pages that allow users to register new accounts or sign into an existing account, and WebAuthn acts as a replacement or enhancement for the authentication part of the system.
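A hedged sketch of the registration half; the challenge and user details here are placeholders that would really come from the server:

```js
const credential = await navigator.credentials.create({
  publicKey: {
    challenge: new Uint8Array(32), // in practice, random bytes from the server
    rp: { name: "Example Site" },  // the relying party (this website)
    user: {
      id: new TextEncoder().encode("user-id-from-server"), // placeholder
      name: "user@example.com",
      displayName: "Example User",
    },
    pubKeyCredParams: [{ type: "public-key", alg: -7 }], // -7 = ES256
  },
});
// credential.response (the attestation) would then be sent to the server.
```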
Web Components is a suite of different technologies allowing you to create reusable custom elements - with their functionality encapsulated away from the rest of your code - and utilize them in your web apps.
Web Components consists of three main technologies, which can be used together to create custom elements with encapsulated functionality that can be reused wherever you like without fear of code collisions.
- Custom Elements: A set of JavaScript APIs that allow you to define custom elements and their behavior, which can then be used as desired in your user interface.
- Shadow DOM: A set of JavaScript APIs for attaching an encapsulated "shadow" DOM tree to an element - which is rendered separately from the main document DOM - and controlling associated functionality.
- HTML Templates: The <template> and <slot> elements enable you to write markup templates that are not displayed in the rendered page.
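A small sketch combining the three technologies: a custom element (the hello-card name is made up) whose markup lives in an encapsulated shadow DOM with a <slot>:

```js
class HelloCard extends HTMLElement {
  constructor() {
    super();
    // Shadow DOM: rendered separately from the main document DOM.
    const shadow = this.attachShadow({ mode: "open" });
    shadow.innerHTML = `
      <style>p { font-weight: bold; }</style>
      <p>Hello, <slot>world</slot>!</p>
    `;
  }
}
// Custom Elements: register the tag so it can be used in markup.
customElements.define("hello-card", HelloCard);
// Usage: <hello-card>Web Components</hello-card>
```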
The Web Locks API allows scripts running in one tab or worker to asynchronously acquire a lock, hold it while work is performed, then release it. While held, no other script executing in the same origin can acquire the same lock, which allows a web app running in multiple tabs or workers to coordinate work and the use of resources.
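A minimal sketch; "my_resource" is an arbitrary lock name and doExclusiveWork() is a hypothetical function:

```js
await navigator.locks.request("my_resource", async (lock) => {
  // Only one tab or worker from this origin holds the lock at a time;
  // it is released when this async callback settles.
  await doExclusiveWork(); // hypothetical
});
```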
The Web MIDI API connects to and interacts with Musical Instrument Digital Interface (MIDI) Devices. The interfaces deal with the practical aspects of sending and receiving MIDI messages. Therefore, the API can be used for musical and non-musical uses, with any MIDI device connected to your computer.
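A quick sketch of listing connected MIDI inputs and logging their raw messages:

```js
const midiAccess = await navigator.requestMIDIAccess();
for (const input of midiAccess.inputs.values()) {
  console.log(`MIDI input: ${input.name}`);
  input.onmidimessage = (event) => {
    console.log(event.data); // Uint8Array of raw MIDI bytes
  };
}
```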
The Web NFC API allows exchanging data over NFC via light-weight NFC Data Exchange Format (NDEF) messages.
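A minimal sketch of scanning for NDEF messages (currently Chromium on Android only, and scan() must be triggered from a user gesture):

```js
const reader = new NDEFReader();
await reader.scan(); // prompts for permission
reader.addEventListener("reading", ({ message, serialNumber }) => {
  for (const record of message.records) {
    console.log(record.recordType, record.mediaType);
  }
});
```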
The Notifications API allows web pages to control the display of system notifications to the end user. These are outside the top-level browsing context viewport, so they can be displayed even when the user has switched tabs or moved to a different app. The API is designed to be compatible with existing notification systems across different platforms.
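A minimal sketch of requesting permission and showing a notification (the text is placeholder):

```js
const permission = await Notification.requestPermission();
if (permission === "granted") {
  // Displayed by the system, even if the user has switched tabs.
  new Notification("Hello!", { body: "Placeholder notification text." });
}
```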
The Web Serial API provides a way for websites to read from and write to serial devices. These devices may be connected via a serial port, or be USB or Bluetooth devices that emulate a serial port.
The Web Serial API is one of a set of APIs that allow websites to communicate with peripherals connected to a user's computer. It provides the ability to connect to devices that are required by the operating system to communicate via the serial API, rather than USB devices, which can be accessed via the WebUSB API, or input devices, which can be accessed via the WebHID API.
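A hedged sketch: prompt the user to pick a port, open it at an assumed baud rate, and read one chunk:

```js
const port = await navigator.serial.requestPort(); // needs a user gesture
await port.open({ baudRate: 9600 }); // the correct baud rate depends on the device

const reader = port.readable.getReader();
const { value, done } = await reader.read(); // value is a Uint8Array
reader.releaseLock();
```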
The Web Share API provides a mechanism for sharing text, links, files, and other content to an arbitrary share target selected by the user.
The API allows a site to share text, links, files, and other content to user-selected share targets, utilizing the sharing mechanisms of the underlying operating system. These share targets typically include the system clipboard, email, contacts or messaging applications, and Bluetooth or Wi-Fi channels. The API has two methods. The navigator.canShare() method may be used to first validate whether some data is "shareable", prior to passing it to navigator.share() for sending.
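A minimal sketch of that two-method flow, with a placeholder URL:

```js
const shareData = { title: "Web APIs", url: "https://example.com" }; // placeholder
if (navigator.canShare && navigator.canShare(shareData)) {
  await navigator.share(shareData); // must be called from a user gesture
}
```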
The Web Speech API enables you to incorporate voice data into web apps. The Web Speech API has two parts: Speech Synthesis (Text-to-Speech), and SpeechRecognition (Asynchronous Speech Recognition).
The Web Speech API makes web apps able to handle voice data. There are two components to this API:
- Speech recognition is accessed via the SpeechRecognition interface, which provides the ability to recognize voice context from an audio input (normally via the device's default speech recognition service) and respond appropriately. Generally you'll use the interface's constructor to create a new SpeechRecognition object, which has a number of event handlers available for detecting when speech is input through the device's microphone. The SpeechGrammar interface represents a container for a particular set of grammar that your app should recognize. Grammar is defined using the JSpeech Grammar Format (JSGF).
- Speech synthesis is accessed via the SpeechSynthesis interface, a text-to-speech component that allows programs to read out their text content (normally via the device's default speech synthesizer). Different voice types are represented by SpeechSynthesisVoice objects, and different parts of text that you want to be spoken are represented by SpeechSynthesisUtterance objects.
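A short sketch of both components (SpeechRecognition is still prefixed in Chromium, hence the fallback):

```js
// Speech synthesis: read some text aloud.
const utterance = new SpeechSynthesisUtterance("Hello from the Web Speech API");
speechSynthesis.speak(utterance);

// Speech recognition: log the first transcript heard via the microphone.
const Recognition = window.SpeechRecognition || window.webkitSpeechRecognition;
const recognition = new Recognition();
recognition.onresult = (event) => {
  console.log(event.results[0][0].transcript);
};
recognition.start();
```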
The Web Storage API provides mechanisms by which browsers can store key/value pairs, in a much more intuitive fashion than using cookies.
The two mechanisms within Web Storage are as follows:
- sessionStorage maintains a separate storage area for each given origin that's available for the duration of the page session.
- localStorage does the same thing, but persists even when the browser is closed and reopened.
These mechanisms are available via the Window.sessionStorage and Window.localStorage properties. Invoking one of these will return an instance of a Storage object, through which data items can be set, retrieved, and removed. Both sessionStorage and localStorage are synchronous in nature. This means that when data is set, retrieved, or removed from these storage mechanisms, the operations are performed synchronously, blocking the execution of other JavaScript code until the operation is completed. Developers should therefore be cautious about using them for tasks that involve a significant amount of data or are computationally intensive.
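The synchronous key/value interface, sketched with arbitrary keys:

```js
localStorage.setItem("theme", "dark");      // persists across browser restarts
sessionStorage.setItem("draft", "hello");   // cleared when the page session ends

console.log(localStorage.getItem("theme")); // "dark"
localStorage.removeItem("theme");
```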
Web Workers makes it possible to run a script operation in a background thread separate from the main execution thread of a web application. The advantage of this is that laborious processing can be performed in a separate thread, allowing the main (usually the UI) thread to run without being blocked / slowed down.
A worker is an object created using a constructor (e.g. Worker()) that runs a named JavaScript file - this file contains the code that will run in the worker thread. You can't directly manipulate the DOM from inside the worker, or use some default methods and properties of the Window object. Data is sent between workers and the main thread via a system of messages - both sides send their messages using the postMessage() method, and respond to messages via the onmessage event handler (the message is contained within the message event's data property). The data is copied rather than shared.
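A sketch of that message-passing pattern; worker.js is a made-up file name:

```js
// main.js
const worker = new Worker("worker.js");
worker.postMessage({ n: 42 }); // the data is copied, not shared
worker.onmessage = (event) => console.log(event.data); // 84

// worker.js
// onmessage = (event) => {
//   postMessage(event.data.n * 2); // reply to the main thread
// };
```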
The WebCodecs API gives web developers low-level access to the individual frames of a video stream and chunks of audio. It is useful for web applications that require full control over the way media is processed, for example video or audio editors and video conferencing.
The WebCodecs API provides access to codecs that are already in the browser. It gives access to raw video frames, chunks of audio data, image decoders, and audio and video encoders and decoders.
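A hedged sketch of configuring a video encoder; the codec string and dimensions are arbitrary, and each captured VideoFrame would be passed to encode():

```js
const encoder = new VideoEncoder({
  output: (chunk, metadata) => console.log(chunk.byteLength), // encoded chunk
  error: (e) => console.error(e),
});
encoder.configure({ codec: "vp8", width: 640, height: 480 });
// encoder.encode(videoFrame) would then be called for each frame.
```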
WebGL (Web Graphics Library) is a JavaScript API for rendering high-performance interactive 3D and 2D graphics within any compatible web browser without the use of plug-ins. WebGL does so by introducing an API that closely conforms to OpenGL ES 2.0 and that can be used in HTML <canvas> elements. This conformance makes it possible for the API to take advantage of hardware graphics acceleration provided by the user's device.
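The smallest useful sketch I can think of: get a context from a <canvas> and clear it to a color:

```js
const canvas = document.querySelector("canvas");
const gl = canvas.getContext("webgl");
if (gl) {
  gl.clearColor(0.0, 0.5, 0.5, 1.0); // RGBA teal
  gl.clear(gl.COLOR_BUFFER_BIT);
}
```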
The WebGPU API enables web developers to use the underlying system's GPU (Graphics Processing Unit) to carry out high-performance computations and draw complex images that can be rendered in the browser. WebGPU is the successor to WebGL, providing better compatibility with modern GPUs, support for general-purpose GPU computations, faster operations, and access to more advanced GPU features.
WebGL revolutionized the web in terms of graphical capabilities after it first appeared around 2011. Several libraries have been created to make WebGL apps easier to write: Three.js, Babylon.js, and PlayCanvas. WebGPU addresses the issues with WebGL, providing an updated general-purpose architecture compatible with modern GPU APIs, which feels more "webby". There are several layers of abstraction between a device GPU and a web browser running the WebGPU API. It is useful to understand these as you begin to learn WebGPU:
- Physical devices have GPUs. Most devices only have one GPU, but some have more than one. Different GPU types are available:
- Integrated GPUs, which live on the same board as the CPU and share its memory
- Discrete GPUs, which live on their own board, separate from the CPU
- Software "GPUs", implemented on the CPU
- A native GPU API, which is part of the OS, is a programming interface allowing native applications to use the capabilities of the GPU. API instructions are sent to the GPU (and responses received) via a driver. It is possible for a system to have multiple native OS APIs and drivers available to communicate with the GPU, although it is common for a device to have only one API/driver pair.
- A browser's WebGPU implementation handles communicating with the GPU via a native GPU API driver. A WebGPU adapter effectively represents a physical GPU and driver available in the underlying system, in your code.
- A logical device is an abstraction via which a single web app can access GPU capabilities in a compartmentalized way. Logical devices are required to provide multiplexing capabilities. A physical device's GPU is used by many applications and processes concurrently, including potentially many web apps. Each web app needs to be able to access WebGPU in isolation for security and logic reasons.
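A minimal sketch of walking down those layers from JavaScript: an adapter (physical GPU plus driver), then a logical device:

```js
if (navigator.gpu) {
  const adapter = await navigator.gpu.requestAdapter(); // physical GPU + driver
  const device = await adapter.requestDevice();         // logical device
  // Work is submitted to the GPU through the device's queue.
  console.log(device.queue);
}
```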
A Human Interface Device (HID) is a type of device that takes input from or provides output to humans. It also refers to the HID protocol, a standard for bi-directional communication between a host and a device that is designed to simplify the installation procedure. The HID protocol was originally developed for USB devices but has since been implemented over many other protocols, including Bluetooth.
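A hedged sketch with WebHID: prompt the user to select a device (an empty filter list shows everything), then listen for input reports:

```js
const [device] = await navigator.hid.requestDevice({ filters: [] });
await device.open();
device.addEventListener("inputreport", (event) => {
  console.log(event.reportId, event.data); // event.data is a DataView
});
```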
The WebOTP API provides a streamlined user experience for web apps to verify that a phone number belongs to a user when using it as a sign-in factor. WebOTP is an extension of the Credential Management API.
The verification is done via a two-step process:
- The app client requests a one-time password (OTP), which is obtained from a specially-formatted SMS message sent by the app server.
- JavaScript is used to enter the OTP into a validation form on the app client, and it is submitted back to the server to verify that it matches what was originally sent in the SMS.
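A sketch of the client side of step 2, assuming a form input with autocomplete="one-time-code":

```js
const otp = await navigator.credentials.get({
  otp: { transport: ["sms"] }, // wait for the specially-formatted SMS
});
const input = document.querySelector("input[autocomplete='one-time-code']");
input.value = otp.code;
// The form would then be submitted back to the server for verification.
```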
WebRTC (Web Real-Time Communication) is a technology that enables Web applications and sites to capture and optionally stream audio and/or video media, as well as to exchange arbitrary data between browsers without requiring an intermediary. The set of standards that comprise WebRTC makes it possible to share data and perform teleconferencing peer-to-peer, without requiring that the user install plug-ins or any other third-party software. WebRTC consists of several interrelated APIs and protocols which work together to achieve this.
WebRTC serves multiple purposes; together with the Media Capture and Streams API, they provide powerful multimedia capabilities to the Web, including support for audio and video conferencing, file exchange, screen sharing, identity management, and interfacing with legacy telephone systems, including support for sending DTMF (touch tone) signals.
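A small sketch of the data-exchange side: a peer connection with a data channel (the label is arbitrary, and the signaling needed to exchange the offer/answer with the other peer still requires your own server):

```js
const pc = new RTCPeerConnection();
const channel = pc.createDataChannel("chat"); // arbitrary label
channel.onopen = () => channel.send("hello");

const offer = await pc.createOffer();
await pc.setLocalDescription(offer);
// The offer is then sent to the remote peer via your signaling channel.
```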
The WebSocket API makes it possible to open a two-way interactive communication session between the user's browser and a server. With this API, you can send messages to a server and receive responses without having to poll the server for a reply.
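A minimal sketch with a placeholder URL:

```js
const socket = new WebSocket("wss://example.com/socket"); // placeholder URL
socket.onopen = () => socket.send("ping");
socket.onmessage = (event) => console.log(event.data); // no polling needed
```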
The WebTransport API provides a modern update to WebSockets, transmitting data between client and server using HTTP/3 Transport. WebTransport provides support for multiple streams, unidirectional streams, and out-of-order delivery. It enables reliable transport via streams and unreliable transport via UDP-like datagrams.
HTTP/3 has been in progress since 2018. It is based on Google's QUIC protocol and fixes several issues around the classic TCP protocol, on which HTTP and WebSockets are based. These improvements include:
- Head-of-line blocking: HTTP/2 allows multiplexing, so a single connection can stream multiple resources simultaneously. However, if a single resource fails, all other resources on that connection are held up until all missing packets are retransmitted. With QUIC, only the failing resource is affected.
- Faster Performance: QUIC is more performant than TCP in many ways. QUIC can handle security features by itself, rather than handing responsibility off to other protocols like TLS - meaning fewer round trips. And streams provide better transport efficiency than the older packet mechanism. That can make a significant difference, especially on high-latency networks.
- Better Network Transitions: QUIC identifies a connection by a connection ID rather than by the client's IP address and port, so the connection can survive a switch between networks (for example, from Wi-Fi to cellular) without interruption.
- Unreliable Transport: HTTP/3 supports unreliable data transmission via datagrams.
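A hedged sketch of the unreliable, UDP-like side of WebTransport; the URL is a placeholder, and the server must speak HTTP/3:

```js
const transport = new WebTransport("https://example.com:4999/wt"); // placeholder
await transport.ready;

const writer = transport.datagrams.writable.getWriter();
await writer.write(new Uint8Array([1, 2, 3])); // may arrive out of order, or not at all
```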
The WebUSB API provides a way to expose non-standard Universal Serial Bus (USB) compatible device services to the web, to make USB safer and easier to use.
USB is the de-facto standard for wired peripherals. The USB devices that you connect to your computer are typically grouped into a number of device classes - such as keyboards, mice, video devices, and so on. WebUSB provides a way for these non-standardized USB device services to be exposed to the web. This means that hardware manufacturers will be able to provide a way for their device to be accessed from the web, without having to provide their own API.
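A sketch of requesting a device by vendor ID (0x2341 is Arduino's, used here only as an example) and claiming an interface:

```js
const device = await navigator.usb.requestDevice({
  filters: [{ vendorId: 0x2341 }], // example: Arduino
});
await device.open();
await device.selectConfiguration(1); // configuration/interface numbers vary by device
await device.claimInterface(0);
```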
No longer recommended
WebVR provides support for exposing virtual reality devices - for example, head-mounted displays like the Oculus Rift or HTC Vive - to web apps, enabling developers to translate position and movement information from the display into movement around a 3D scene.
Any VR devices attached to your computer will be returned by the Navigator.getVRDisplays() method; each one will be represented by a VRDisplay object. The WebVR API, which was never ratified as a web standard, has been deprecated in favor of the WebXR API, which is well on track toward finishing the standardization process.
Web Video Text Tracks (WebVTT) are text tracks providing text "cues" that are time-aligned with other media, such as video or audio tracks. The WebVTT API provides functionality to define and manipulate these text tracks. The WebVTT API is primarily used for displaying subtitles or captions that overlay video content, but it has other uses: providing chapter information for easier navigation and generic metadata that needs to be time-aligned with audio or video content.
A text track is a container for time-aligned text data that can be played in parallel with a video or audio track to provide a translation, transcription, or overview of the content. A video or audio media element may define tracks of different kinds or in different languages, allowing users to display appropriate tracks based on their preferences or needs. The different kinds of text data that can be specified are listed below.
- subtitles provide a textual translation of spoken dialog. This is the default type of text track, and if used, the source language must be specified.
- captions provide a transcription of spoken text, and may include information about other audio such as music or background noise. They are intended for hearing-impaired users.
- chapters provide high-level navigation information, allowing users to more easily switch to relevant content.
- metadata is used for any other kinds of time-aligned information.
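A small sketch of creating a captions track and a timed cue from JavaScript (the cue text and timings are arbitrary):

```js
const video = document.querySelector("video");
const track = video.addTextTrack("captions", "English", "en");
track.mode = "showing";
track.addCue(new VTTCue(0, 5, "Hello from WebVTT")); // start s, end s, text
```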
WebXR is a group of standards which are used together to support rendering 3D scenes to hardware designed for presenting virtual worlds (virtual reality (VR)), or for adding graphical imagery to the real world (augmented reality (AR)). The WebXR Device API implements the core of the WebXR feature set, managing the selection of output devices, rendering the 3D scene to the chosen device at the appropriate frame rate, and managing motion vectors created using input controllers.
WebXR-compatible devices include fully-immersive 3D headsets with motion and orientation tracking, eyeglasses which overlay graphics atop the real-world scene passing through the frames, and handheld mobile phones which augment reality by capturing the world with a camera and augmenting that scene with computer-generated imagery. The WebXR Device API provides the following key capabilities:
- Find compatible VR or AR output devices
- Render a 3D scene to the device at an appropriate frame rate
- (Optionally) mirror the output to a 2D display
- Create vectors representing the movements of input controls
At the most basic level, a scene is presented in 3D by computing the position of each of the user's eyes and rendering the scene from each of those viewpoints, looking in the direction the user is currently facing. The two images are rendered into a single framebuffer, with the left eye's rendered image on the left half and the right eye's on the right half of the buffer. Once both eyes' perspectives on the scene have been rendered, the resulting framebuffer is delivered to the WebXR device to be presented to the user through their headset or other appropriate display device.
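A hedged sketch of starting a session and hooking into the per-frame render loop (the actual per-eye rendering is omitted, and requestSession() usually requires a user gesture):

```js
if (navigator.xr && (await navigator.xr.isSessionSupported("immersive-vr"))) {
  const session = await navigator.xr.requestSession("immersive-vr");
  session.requestAnimationFrame((time, frame) => {
    // Render the scene once per eye into the framebuffer here.
  });
}
```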
The Window Controls Overlay API gives Progressive Web Apps installed on desktop operating systems the ability to hide the default window title bar and display their own content over the full surface area of the app window, turning the control buttons (maximize, minimize, and close) into an overlay.
Before using this feature, the following conditions must be true:
- The Web App Manifest's display_override member must be set to window-controls-overlay.
- The Progressive Web App must be installed on a desktop operating system.
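Once those conditions hold, the overlay geometry can be read from JavaScript; a hedged sketch:

```js
const overlay = navigator.windowControlsOverlay;
if (overlay && overlay.visible) {
  const rect = overlay.getTitlebarAreaRect(); // space left for your own title bar
  console.log(rect.width, rect.height);
}
overlay?.addEventListener("geometrychange", (event) => {
  console.log(event.visible, event.titlebarAreaRect);
});
```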
The Window Management API allows you to get detailed information on the displays connected to your device and more easily place windows on specific screens, paving the way towards more effective multi-screen applications.
The Window Management API provides robust, flexible window management. It allows you to query whether your display is extended with multiple screens and get information on each screen separately: windows can then be placed on each screen as desired. It also provides event handlers to allow you to respond to changes in the available screens. It is useful in cases such as:
- Multi-window graphics editors and audio processors that may wish to arrange editing tools and panels across different screens.
- Virtual trading desks that want to show market trends in multiple windows and put specific windows of interest in fullscreen mode.
- Slideshow apps that want to show speaker notes on the internal primary screen and presentation on an external projector.
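A hedged sketch of the basic pattern behind those use cases: enumerate screens, then place a window on a secondary display (notes.html is a placeholder):

```js
const details = await window.getScreenDetails(); // prompts for permission
for (const screen of details.screens) {
  console.log(screen.label, screen.isPrimary);
}

const secondary = details.screens.find((s) => !s.isPrimary);
if (secondary) {
  window.open(
    "notes.html", "_blank",
    `left=${secondary.availLeft},top=${secondary.availTop},width=800,height=600`
  );
}
```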