Frequency of ASCII characters in TypeScript and JavaScript code

Overview

When building a tokenizer to parse TypeScript or JavaScript (as esbuild, SWC, TSC, etc. do), consider these findings when writing the switch statement cases and choosing their order: the fewer checks you make, the faster it will be. Not all characters are created equal. In the results below, a space (32) is more than 1,500 times more frequent than % (37).
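As a rough illustration of that idea, here is a minimal sketch of a character classifier whose cases are ordered by the frequencies in the results table. The `Kind` type and the `classify` function are hypothetical names made up for this example, not code from this repository or from any of the tools mentioned above.

```go
package main

import "fmt"

// Kind is a hypothetical coarse token class used only for this illustration.
type Kind int

const (
	Ident Kind = iota // letters, '_' and '$'
	Space             // ' ', '\n', '\r', '\t'
	Digit             // '0' to '9'
	Punct             // everything else
)

// classify tests the most frequent groups first (letters, then space and
// newline, then digits), so the common case returns after few comparisons.
// A tagless switch in Go evaluates its cases top to bottom.
func classify(c byte) Kind {
	switch {
	case 'a' <= c && c <= 'z', 'A' <= c && c <= 'Z':
		return Ident
	case c == ' ', c == '\n':
		return Space
	case '0' <= c && c <= '9':
		return Digit
	case c == '\r', c == '\t':
		return Space
	case c == '_', c == '$':
		return Ident
	default:
		return Punct
	}
}

func main() {
	counts := map[Kind]int{}
	for _, c := range []byte("const x = 42;\n") {
		counts[classify(c)]++
	}
	// Identifier chars, whitespace, digits, punctuation.
	fmt.Println(counts[Ident], counts[Space], counts[Digit], counts[Punct]) // prints "6 4 2 2"
}
```

Whether this ordering actually helps depends on the compiler and the shape of the switch (a dense switch on a single byte may be compiled into a jump table, where case order matters little), so it is worth benchmarking before relying on it.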

How the character counter works

  • It recursively traverses all the folders and subfolders it finds inside the data folder in the root directory of this repository
  • Finds every file but only counts the ones with the following extensions:
    • .ts
    • .tsx
    • .js
    • .jsx
    • .mts
    • .mjs
    • .cts
    • .cjs
    • .d.ts
    • .d.mts
    • .d.cts
  • Scans every character (reading its ASCII code) and counts the number of occurrences
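The steps above can be sketched in Go roughly as follows. This is not the repository's actual source: the names `wanted` and `countBytes` are hypothetical, and the sketch assumes the counter tallies raw bytes, which would be consistent with the results listing codes above 127.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// extensions mirrors the list above. ".d.ts", ".d.mts" and ".d.cts" already
// end in ".ts", ".mts" and ".cts", so they need no separate entries.
var extensions = []string{".ts", ".tsx", ".js", ".jsx", ".mts", ".mjs", ".cts", ".cjs"}

// wanted reports whether a file name ends with one of the counted extensions.
func wanted(name string) bool {
	for _, ext := range extensions {
		if strings.HasSuffix(name, ext) {
			return true
		}
	}
	return false
}

// countBytes walks root recursively and tallies every byte of each matching
// regular file. Index i of the result holds the occurrences of byte value i.
func countBytes(root string) ([256]uint64, error) {
	var counts [256]uint64
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		// Skip directories (some are named like "chart.js") and other non-files.
		if !d.Type().IsRegular() || !wanted(d.Name()) {
			return nil
		}
		data, err := os.ReadFile(path)
		if err != nil {
			return err
		}
		for _, b := range data {
			counts[b]++
		}
		return nil
	})
	return counts, err
}

func main() {
	// "data" is the folder described above, relative to the working directory.
	counts, err := countBytes("data")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("spaces:", counts[' '])
}
```

Reading each file fully into memory keeps the sketch short; for very large repositories a buffered reader would use less memory.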

Run it yourself

  • Create a data folder in the root directory of this repository and clone the repositories you want to scan inside it

  • Make sure you have Go and Node installed, then run npm i

  • Run the script with npm run start

  • Check the results

Outputs

output.json: The key is the ASCII code, the value is the number of occurrences.

stdout: A table with the escaped names, ASCII codes and occurrences. Note that all the letters (a - z, A - Z) and digits (0 - 9) are grouped.

output.csv: Same as the stdout table but in CSV format.
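To make the three outputs concrete, here is a minimal sketch of how they could be produced from an array of per-byte counts. The `row` type and the `buildRows`, `toJSON` and `writeCSV` helpers are hypothetical names for this example, and the escaping done by `strconv.Quote` is an assumption that may differ from what the real script prints.

```go
package main

import (
	"encoding/csv"
	"encoding/json"
	"io"
	"os"
	"sort"
	"strconv"
)

// row is one table line: escaped name, char code ("---" for groups), count.
type row struct {
	Name, Code string
	Count      uint64
}

func isLetter(b int) bool { return (b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z') }
func isDigit(b int) bool  { return b >= '0' && b <= '9' }

// buildRows groups letters and digits, skips zero counts and sorts the
// result by count in descending order.
func buildRows(counts [256]uint64) []row {
	letters := row{Name: strconv.Quote("A - z"), Code: "---"}
	digits := row{Name: strconv.Quote("0 - 9"), Code: "---"}
	var rows []row
	for b, n := range counts {
		switch {
		case n == 0:
			// Character never seen: no row.
		case isLetter(b):
			letters.Count += n
		case isDigit(b):
			digits.Count += n
		default:
			rows = append(rows, row{Name: strconv.Quote(string(rune(b))), Code: strconv.Itoa(b), Count: n})
		}
	}
	for _, g := range []row{letters, digits} {
		if g.Count > 0 {
			rows = append(rows, g)
		}
	}
	sort.SliceStable(rows, func(i, j int) bool { return rows[i].Count > rows[j].Count })
	return rows
}

// toJSON maps each char code with a non-zero count to its occurrences.
func toJSON(counts [256]uint64) ([]byte, error) {
	m := map[string]uint64{}
	for b, n := range counts {
		if n > 0 {
			m[strconv.Itoa(b)] = n
		}
	}
	return json.MarshalIndent(m, "", "  ")
}

// writeCSV writes the table header followed by one record per row.
func writeCSV(w io.Writer, rows []row) error {
	cw := csv.NewWriter(w)
	cw.Write([]string{"Character", "CharCode", "Occurrences"})
	for _, r := range rows {
		cw.Write([]string{r.Name, r.Code, strconv.FormatUint(r.Count, 10)})
	}
	cw.Flush()
	return cw.Error() // reports any error from the writes above
}

func main() {
	var counts [256]uint64
	for _, b := range []byte("let a = 10;\n") {
		counts[b]++
	}
	out, err := toJSON(counts)
	if err != nil {
		panic(err)
	}
	os.Stdout.Write(append(out, '\n'))
	if err := writeCSV(os.Stdout, buildRows(counts)); err != nil {
		panic(err)
	}
}
```

The sketch prints both formats to stdout instead of writing output.json and output.csv, so it can be run without touching the file system.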

Results

The following is an example output after running the script with the following repositories in the data folder:

  • Angular
  • Vue
  • definitely-typed
  • node
  • react
  • Typescript

Character CharCode Occurrences
"A - z" --- 283946350
" " 32 118426366
"\n" 10 12939620
"0 - 9" --- 12032791
"/" 47 7656952
"." 46 5719973
"," 44 5261314
"*" 42 4817435
"(" 40 4596610
")" 41 4576496
":" 58 4410275
";" 59 4259587
"\"" 34 2987244
"=" 61 2972430
"'" 39 2243891
"{" 123 2129247
"}" 125 2127271
"\r" 13 1913084
"-" 45 1848648
">" 62 1526829
"_" 95 1382751
"|" 124 1185122
"[" 91 1165403
"]" 93 1143123
"<" 60 674655
"`" 96 666969
"?" 63 615641
"+" 43 593973
"@" 64 583870
"&" 38 317592
"^" 94 301090
"\\" 92 273808
"!" 33 204297
"$" 36 176215
"å" 229 166608
"#" 35 146205
"æ" 230 109732
"ç" 231 87046
"\t" 9 79940
"%" 37 74606
"è" 232 73751
"ä" 228 53278
"Ð" 208 42002
"à" 224 38155
"é" 233 37039
"â" 226 30489
"É" 201 28294
"ï" 239 28069
"ã" 227 25070
"×" 215 20241
"\u001b" 27 20113
"Ñ" 209 17768
"Î" 206 17690
"~" 126 13372
"Ï" 207 8630
"ì" 236 8361
"Ã" 195 8231
"Â" 194 7939
"Ø" 216 6893
"ë" 235 5871
"Ù" 217 4212
"ð" 240 4032
"í" 237 2765
"Å" 197 2620
"á" 225 2444
"ê" 234 2276
"\u0000" 0 1674
"Ä" 196 1223
"Û" 219 1168
"Ì" 204 752
"Õ" 213 476
"Ú" 218 436
"Ë" 203 294
"Ò" 210 250
"Ö" 214 205
"Í" 205 190
"Æ" 198 166
"î" 238 136
"Ó" 211 113
"Ç" 199 98
"Ê" 202 88
"È" 200 77
"Ô" 212 74
"\u0006" 6 70
"\u0001" 1 41
"\u007f" 127 20
"Þ" 222 19
"\u0004" 4 10
"\u0015" 21 10
"\u001f" 31 10
"ô" 244 4
"\u000e" 14 3
"\b" 8 2
"\u0019" 25 2
"\u0007" 7 1
"\u000b" 11 1
"\u0018" 24 1
"ò" 242 1


Charts

All characters

[Chart: all characters plotted]

Without letters and digits

[Chart: all characters except letters and digits plotted]

Without letters, digits and whitespace

[Chart: all characters except letters, digits and whitespace plotted]

Owner
Elian Cordoba