Digest 2023-03

@snakers4

____________________________________________________________________________

# ML

The Transformer Family Version 2.0 - https://lilianweng.github.io/posts/2023-01-27-the-transformer-family-v2/

Update #42: AI + News Editors Make Mistakes and New Masked Image Self-Supervised Methods - https://thegradientpub.substack.com/p/update-42-ai-news-editors-make-mistakes

The technology behind GitHub’s new code search - https://github.blog/2023-02-06-the-technology-behind-githubs-new-code-search/

AI Psychosis - https://blog.piekniewski.info/2023/02/07/ai-psychosis/

Yandex scrapes Google and other SEO learnings from the source code leak - https://searchengineland.com/yandex-leak-learnings-392393

Probability theory in machine learning. Part 1: the regression model - https://habr.com/ru/company/ods/blog/713920/

Probability theory in machine learning. Part 2: the classification model - https://habr.com/ru/company/ods/blog/714670/

Trends in the dollar training cost of machine learning systems - https://epochai.org/blog/trends-in-the-dollar-training-cost-of-machine-learning-systems

The Inference Cost Of Search Disruption – Large Language Model Cost Analysis - https://www.semianalysis.com/p/the-inference-cost-of-search-disruption

The AI Brick Wall – A Practical Limit For Scaling Dense Transformer Models, and How GPT 4 Will Break Past It - https://www.semianalysis.com/p/the-ai-brick-wall-a-practical-limit

Training Compute-Optimal Large Language Models - https://arxiv.org/pdf/2203.15556.pdf

How Nvidia’s CUDA Monopoly In Machine Learning Is Breaking - OpenAI Triton And PyTorch 2.0 - https://www.semianalysis.com/p/nvidiaopenaitritonpytorch

Peeling The Onion’s Layers - Large Language Models Search Architecture And Cost - https://www.semianalysis.com/p/peeling-the-onions-layers-large-language

The Impact and Future of ChatGPT - https://lastweekin.ai/p/chatgpt-impact

Enhance this! Modern super-resolution in 2023 - https://habr.com/ru/post/716706/

OpenAI's Whisper is another case study in Colonisation - https://blog.papareo.nz/whisper-is-another-case-study-in-colonisation/

How much does it cost to keep a virtual girlfriend? Building a companion that records Telegram video messages, using 4 neural networks - https://habr.com/ru/company/selectel/blog/718134/

The first 10 mistakes in an ML engineer's career - https://habr.com/ru/post/718942/

Step-by-step guide: how we at VK build our own translator - https://habr.com/ru/company/vk/blog/718194/

Datasets at your fingertips in Google Search - https://ai.googleblog.com/2023/02/datasets-at-your-fingertips-in-google.html

GPT in 60 Lines of NumPy - https://jaykmody.com/blog/gpt-from-scratch#fnref3

Universal Speech Model (USM): State-of-the-art speech AI for 100+ languages - https://ai.googleblog.com/2023/03/universal-speech-model-usm-state-of-art.html

Must read: the 100 most cited AI papers in 2022 - https://www.zeta-alpha.com/post/must-read-the-100-most-cited-ai-papers-in-2022

Symbolic Discovery of Optimization Algorithms - https://github.com/google/automl/tree/master/lion

Paper Review: Scaling Vision Transformers to 22 Billion Parameters - https://andlukyane.com//blog/paper-review-vit-22

Paper Review: PaLM-E: An Embodied Multimodal Language Model - https://andlukyane.com//blog/paper-review-palme

Paper Review: LLaMA: Open and Efficient Foundation Language Models - https://andlukyane.com/blog/paper-review-llama

Exploring 12 Million of the 2.3 Billion Images Used to Train Stable Diffusion’s Image Generator - https://waxy.org/2022/08/exploring-12-million-of-the-images-used-to-train-stable-diffusions-image-generator/

Announcing OpenChatKit - https://www.together.xyz/blog/openchatkit

GPT-4 - https://openai.com/research/gpt-4

Running LLaMA 7B and 13B on a 64GB M2 MacBook Pro with llama.cpp - https://til.simonwillison.net/llms/llama-7b-m2

Conformer-1 - https://www.assemblyai.com/blog/conformer-1/

Paper Review: ReBotNet: Fast Real-time Video Enhancement - https://andlukyane.com//blog/paper-review-rebotnet

Paper Review: Hyena Hierarchy: Towards Larger Convolutional Language Models - https://andlukyane.com/blog/paper-review-hyena

Twitter's Recommendation Algorithm - https://blog.twitter.com/engineering/en_us/topics/open-source/2023/twitter-recommendation-algorithm

The Tale of Bloom Embeddings and Unseen Entities - https://explosion.ai/blog/technical-report

Paper Review: Segment Anything - https://andlukyane.com//blog/paper-review-sam

# Blogs

People always put their money in futures they predict - https://www.strangeloopcanon.com/p/people-always-put-their-money-in

Hustle bros are jumping on the AI bandwagon - https://www.theverge.com/2023/2/2/23582772/chatgpt-ai-get-rich-quick-schemes-hustlers-web

Self Hosting a Google Maps Alternative with OpenStreetMap - https://wcedmisten.fyi/post/self-hosting-osm/

10 surprisingly spectacular elementary cellular automata - https://habr.com/ru/post/718620/

AI, ChatGPT, and Bing … Oh My - https://medium.learningbyshipping.com/ai-chatgpt-and-bing-oh-my-79c47e62c666

The New Gatekeepers - https://www.ben-evans.com/presentations

Please stop writing shell scripts - https://pythonspeed.com/articles/shell-scripts/

The future of the transistor - https://www.semianalysis.com/p/the-future-of-the-transistor

Opinion: DevOps is a cancer - https://habr.com/ru/post/721646/

Notes from a bank run - https://petewarden.com/2023/03/12/notes-from-a-bank-run/

Reverse Engineering A Mysterious UDP Stream in My Hotel - https://www.gkbrk.com/2016/05/hotel-music/

BIG DATA IS DEAD - https://motherduck.com/blog/big-data-is-dead

OUR $0.02 ON SVB - https://digitstodollars.com/2023/03/14/our-0-02-on-svb/

Retail, search and Amazon’s $40bn ‘advertising’ business - https://www.ben-evans.com/benedictevans/2023/3/6/ways-to-think-about-amazon-advertising

Not even wrong: predicting tech - https://www.ben-evans.com/benedictevans/2020/5/16/not-even-wrong

# Hardware

SEMIS TOP FIVE - https://digitstodollars.com/2023/02/03/semis-top-five/

STICK A FORK IN THEM - https://digitstodollars.com/2023/02/07/stick-a-fork-in-them/

Optane’s Legacy, Part I: New Programming Paradigm and Instructions - https://thessdguy.com/optanes-legacy-part-i-new-programming-paradigm-and-instructions

Amazon’s Cloud Crisis: How AWS Will Lose The Future Of Computing - https://www.semianalysis.com/p/amazons-cloud-crisis-how-aws-will

A TALE OF TWO SALESFORCES - https://digitstodollars.com/2023/03/23/a-tale-of-two-salesforces/

HOW TO SAVE ANDROID - https://digitstodollars.com/2023/03/24/how-to-save-android/

Analog microprocessors with artificial intelligence. How realistic is this? - https://habr.com/ru/companies/mvideo/articles/726790/

# Code

Binaries from Python files: the Nuitka compiler, an overview and a small study - https://habr.com/ru/company/sberbank/blog/710690/

Mirroring GitHub projects in 2023 - https://habr.com/ru/company/pt/blog/714316/

The fastest way to read a CSV in Pandas - https://pythonspeed.com/articles/pandas-read-csv-fast/

Dictionary Dispatch Pattern in Python - https://martinheinz.dev/blog/90

Python’s multiprocessing performance problem - https://pythonspeed.com/articles/faster-multiprocessing-pickle/

Why I Will Never Use Alpine Linux Ever Again - https://martinheinz.dev/blog/92

Reduce - The Power of a Single Python Function - https://martinheinz.dev/blog/93

Speeding up text processing in Python (is hard) - https://pythonspeed.com/articles/faster-text-processing/

# Datasets

Biggest FOSS image datasets - https://laion.ai/projects/

____________________________________________________________________________

Originally posted on - https://t.me/snakers4

