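| Zero Width Space | U+200B | An invisible, zero-width character (U+200B) that marks a permitted line-break position without adding visible space. Because it does not change how text looks, it is sometimes cited as a possible carrier for hidden watermarks or tracking markers in AI-generated text. | ↑ Zero-width space between "Hello" and "World" |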
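| Zero Width Joiner | U+200D | A non-printing character (U+200D) that requests a joined (connected or ligated) rendering of the adjacent characters. It is used in complex scripts such as Arabic and Indic scripts and in emoji sequences, and, being invisible, could also serve as a steganographic watermark in AI-generated content. | ↑ Family emoji combined using ZWJ |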
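| Zero Width Non-Joiner | U+200C | An invisible character (U+200C) that prevents adjacent characters from connecting or forming a ligature where they otherwise would. It is used in typography for scripts such as Persian, and is one of the hidden characters that AI-text detection may look for as a marker. | |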
| Word Joiner | U+2060 | An invisible, zero-width character (U+2060) that prevents a line break at its position, keeping the adjacent characters together. Because it is not visible, it could also serve as a fingerprinting mechanism. | ↑ Prevents "$100" from breaking |
| Non-Breaking Space | U+00A0 | A space character (U+00A0) that prevents automatic line breaks, commonly used for proper text formatting and occasionally for content tracking. | ↑ Non-breaking space between number and unit |
| En/Em Spaces | U+2002, U+2003 | Fixed-width space characters used in professional typography. The en space (U+2002) is nominally half an em wide, and the em space (U+2003) is one em wide, which nominally equals the font size. | |
| Ideographic Space | U+3000 | A full-width space character (U+3000) used in CJK (Chinese, Japanese, Korean) text formatting, matching the width of ideographic characters. | |
| Em Dash | U+2014 | A long dash (U+2014) used for punctuation breaks in text, longer than a hyphen and often used to indicate a pause or break in thought. | |
| Line Separator | U+2028 | A formatting character (U+2028) that marks line boundaries in text, used for proper text segmentation without starting a new paragraph. | ↑ Line separator marks line boundary |
| Paragraph Separator | U+2029 | A formatting character (U+2029) that explicitly marks paragraph boundaries, providing semantic structure to text content. | |
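| Left-to-Right Mark | U+200E | A non-visible character (U+200E) that behaves like a strong left-to-right character in the Unicode bidirectional algorithm. It does not reorder text by itself but influences how adjacent neutral characters, such as punctuation, are placed in mixed-direction text. | ↑ LTR mark controls direction |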
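| Right-to-Left Mark | U+200F | A non-visible character (U+200F) that behaves like a strong right-to-left character in the Unicode bidirectional algorithm. It helps neutral characters, such as punctuation, render on the correct side in Arabic, Hebrew, and other RTL text mixed with LTR content. | ↑ RTL mark controls direction |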
| Byte Order Mark | U+FEFF | A special character (U+FEFF) placed at the start of a file or stream. In UTF-16 and UTF-32 it signals the byte order (big-endian or little-endian); in UTF-8 it is optional and acts only as an encoding signature. | ↑ BOM at the start of UTF-8 file |
| Regular Space | U+0020 | The standard space character (U+0020) used for word separation, and the most common whitespace character in languages that separate words with spaces. | ↑ Standard space character |
| Tab Character | U+0009 | A horizontal tab character (U+0009) used for indentation and column alignment in code and formatted text. Its rendered width depends on the tab-stop settings of the displaying application. | ↑ Tab character for alignment |
| Line Break | U+000A | The newline sequence that marks the end of a line: LF (U+000A) on Unix-like systems, or CRLF (U+000D followed by U+000A) on Windows. | ↑ Line break creates new line |
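
The characters in the table can be located programmatically. The following is a minimal Python sketch, not a definitive detector: the names `SPECIAL_CHARS` and `find_special_chars` are hypothetical, and the mapping covers only the invisible or typographic entries from the table (regular space, tab, and line break are left out because they appear in nearly all text).

```python
# Code points copied from the table above; names here are illustrative only.
SPECIAL_CHARS = {
    "\u200b": "Zero Width Space",
    "\u200d": "Zero Width Joiner",
    "\u200c": "Zero Width Non-Joiner",
    "\u2060": "Word Joiner",
    "\u00a0": "Non-Breaking Space",
    "\u2002": "En Space",
    "\u2003": "Em Space",
    "\u3000": "Ideographic Space",
    "\u2014": "Em Dash",
    "\u2028": "Line Separator",
    "\u2029": "Paragraph Separator",
    "\u200e": "Left-to-Right Mark",
    "\u200f": "Right-to-Left Mark",
    "\ufeff": "Byte Order Mark",
}


def find_special_chars(text):
    """Return (index, 'U+XXXX', name) for each listed character found in text."""
    return [
        (index, f"U+{ord(ch):04X}", SPECIAL_CHARS[ch])
        for index, ch in enumerate(text)
        if ch in SPECIAL_CHARS
    ]


# Example: a zero-width space hidden between two words, plus a non-breaking space.
sample = "Hello\u200bWorld\u00a0!"
for hit in find_special_chars(sample):
    print(hit)
# (5, 'U+200B', 'Zero Width Space')
# (11, 'U+00A0', 'Non-Breaking Space')
```

Finding one of these characters does not by itself indicate watermarking: several of them, such as the zero width joiner in emoji sequences or the non-breaking space before units, occur in ordinary text.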