Technologique


Channel geo and language: Worldwide, English
Category: Technology


For deeply involved developers: various aspects, trends & concepts of programming technologies, FLOSS, Linux, security, cloud infrastructures & DevOps practices, distributed systems, data warehousing & analysis, DL/ML, web3, etc.
Author: @andrcmdr





AI and AGI should be fully open-sourced and loyal to builders and the community!

The most important thing I should say, adding to Steve's blog post, is that AI should be open (right now we see the opposite: an AI market concentrated in big tech), free (as in freedom), monetizable, and loyal, for the good of creators/builders/developers and for the community's win. This is the OML principle, and the target goal of the Sentient Foundation, which is building a truly open AGI future and has already developed the Dobby model (and Dobby is already free! =), Sentient Chat, Sentient OpenDeepSearch, the OML Fingerprinting library, the Agent Framework, and the Enclaves Framework (proud to be a leading part of it!).
And all these parts of a groundbreaking product portfolio, all these breakthroughs, were made in less than a year!
More good things to come! Stay tuned!

https://steveklabnik.com/writing/i-am-disappointed-in-the-ai-discourse/

https://www.sentient.xyz

#AI
#AGI
#OpenAGI




Modular provides the MAX platform: the MAX inference backend (engine) and the MAX inference server (MAX Serve).

Just look at this:

https://builds.modular.com/models/DeepSeek-R1-Distill-Llama/8B-Q6_K

https://builds.modular.com/models/Llama-3.3-Instruct/70B?tab=deploy

In terms of deployment it is fantastic: just one (relatively) tiny container!
And in terms of programming: GPU programming and acceleration without CUDA, using the Mojo language (statically compiled via LLVM), which has the capabilities of Rust (static memory safety), uses LLVM MLIR (Multi-Level Intermediate Representation) compilation for amazing low-level code optimization and acceleration, has the syntax of Python, and embraces the whole Python ecosystem. I've been playing with Mojo for quite a while already (and it is the best of both worlds, Rust and Python), but I only started using MAX recently. And llama.cpp doesn't even compare with MAX!
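By the way, a running MAX Serve container exposes an OpenAI-compatible HTTP API, so any stock client can talk to it. A minimal Rust sketch of querying such a deployment (the port, path, and model id are my assumptions for a typical local run, not taken from Modular's docs; it needs the reqwest (blocking) and serde_json crates):

```rust
use serde_json::json;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Assumed local endpoint of the MAX Serve container; adjust to your setup.
    let url = "http://localhost:8000/v1/chat/completions";

    let body = json!({
        // Hypothetical model id for the R1 distill build linked above.
        "model": "DeepSeek-R1-Distill-Llama-8B",
        "messages": [
            { "role": "user", "content": "Explain MLIR in one sentence." }
        ]
    });

    let resp: serde_json::Value = reqwest::blocking::Client::new()
        .post(url)
        .json(&body)
        .send()?
        .error_for_status()?
        .json()?;

    // OpenAI-compatible responses carry the generated text here.
    println!("{}", resp["choices"][0]["message"]["content"].as_str().unwrap_or(""));
    Ok(())
}
```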

#Mojo
#MAX
#AI
#AGI




https://www.youtube.com/live/AyH7zoP-JOg

Great speech!

Privacy and confidentiality should be a fundamental human right in the era of information and ubiquitous computation.

Always think about how your data will be used: what you say and message, what you prompt into a search engine or an AI model, and how it can and will be used, especially against your interests.

#AI
#AGI
#privacy
#confidentiality
#confidential_computing
#CC
#security






The OpenAGI summit at the ETH Denver event.

The updates from Sentient.xyz:

Oleg Golev presented the Sentient Enclaves Framework for confidential AI applications and the Sentient Agents Framework for creating AI agents:

https://youtu.be/Ah5FGrmj81M

A big milestone for our team of Sentient Builders!
Kudos for the shout-outs! 🙌

#AI
#AGI
#OpenAGI
#TEE
#Enclaves
#AWS
#NitroEnclaves










The finale of the story.

How it started:
https://lore.kernel.org/lkml/20250108122825.136021-1-abdiel.janulgue@gmail.com/

How it's going:
https://lore.kernel.org/lkml/20250224-configfs-v4-0-9af9b5e611f6@kernel.org/

How it ended:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/MAINTAINERS?id=815291c11acda54515f1af5ce6fe307490de9127

Moral:
Don't resist positive, valuable changes, and don't be a brake on others and on progress.

Rust to Linux! 🐧
Rust to the Moon! 🚀

#Rust
#RustLang
#Linux




The drama around the inclusion of Rust drivers into the Linux kernel is getting worse and worse.

An old kernel maintainer is against the inclusion of Rust code into the DMA subsystem for drivers.
The Asahi Linux maintainer has resigned from the Linux kernel maintainers.

The question is: what is better? To maintain two languages and handle that complexity, where one of them, Rust, provides memory safety, with unsafe blocks strictly localized and controlled, or to continue maintaining old (and new) error-prone C code with all its memory safety issues, as it has been for many years, without accepting new approaches to developing safe system software, drivers, and kernel modules?
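For readers outside the Rust world, a minimal sketch of that "strictly localized" property (hypothetical, simplified code, not taken from the actual Rust for Linux tree): all the potential memory unsafety of a buffer type lives in a few commented unsafe blocks behind a safe API, so reviewers know exactly where to look.

```rust
/// A toy buffer type holding a raw pointer, the way a C driver would.
pub struct DmaBuffer {
    ptr: *mut [u8],
}

impl DmaBuffer {
    /// Safe constructor: zero-initialized memory, ownership moved into the raw pointer.
    pub fn new(len: usize) -> Self {
        let boxed: Box<[u8]> = vec![0u8; len].into_boxed_slice();
        Self { ptr: Box::into_raw(boxed) }
    }

    /// Safe accessor: callers never touch the raw pointer themselves.
    pub fn as_slice(&self) -> &[u8] {
        // SAFETY: `ptr` came from Box::into_raw in `new`, is non-null,
        // and stays live until Drop; we only hand out shared references.
        unsafe { &*self.ptr }
    }
}

impl Drop for DmaBuffer {
    fn drop(&mut self) {
        // SAFETY: reclaim the Box exactly once, so the memory is freed once.
        unsafe { drop(Box::from_raw(self.ptr)) }
    }
}

fn main() {
    let buf = DmaBuffer::new(64);
    assert_eq!(buf.as_slice().len(), 64); // call sites contain no unsafe at all
}
```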

Overall, that's the problem of old-minded people who don't want to put in the effort to accept something new that can be better than the old approach.

I've got only one question: if there is so much resistance from old-minded people, the kernel maintainers, why was the Rust for Linux subsystem even included into the kernel? What did the old kernel maintainers and Linus himself expect when they did that, i.e. introduced the Rust subsystem into the kernel? Do they want to keep it caged as a mere driver-development subsystem, when the problem is the whole old approach to memory management?

If this situation keeps progressing (and it will!), many new-wave kernel maintainers and developers will leave and resign from the Linux kernel. And that's not good for Linux, as the maintainers get older and older and it has become tremendously hard to bring a new generation of system developers into kernel development and maintenance.
Many newcomers won't even join the Linux kernel because of such a toxic environment in conversations. The better option for new-wave system developers is to consolidate their common efforts on some alternative kernel, like the RedoxOS microkernel for example, a Unix POSIX-compatible kernel fully written in Rust. Someday, 10-20 years from now, when Linux gets old and the cost of maintaining old error-prone C code becomes high (the same as with assembly-language kernels in a previous age), Redox will become a Linux alternative, much like the BSD kernels and distros, but better.

https://lore.kernel.org/lkml/20250207-rm-maint-v1-1-10f069a24f3d@marcan.st/

https://www.theregister.com/2025/02/05/mixing_rust_and_c_linux/

https://www.theregister.com/2025/02/07/linus_torvalds_rust_driver/

#Rust
#RustLang
#Linux
#kernel


Here we go again!

SEV-SNP is vulnerable, again.

New AMD SEV-SNP vulnerability:

https://github.com/google/security-research/security/advisories/GHSA-4xq7-4mgh-gp6w

Exploit:

https://github.com/google/security-research/tree/master/pocs/cpus/entrysign

Reports about two recent vulnerabilities in the SEV-SNP memory encryption and isolation mechanism, at the CPU pipeline, cache, and branch prediction level:

https://www.amd.com/en/resources/product-security/bulletin/amd-sb-3019.html

https://www.amd.com/en/resources/product-security/bulletin/amd-sb-3010.html

AMD reported that the previous mitigations for Spectre-class attacks will work to fix the new vulnerabilities:

https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/tuning-guides/software-techniques-for-managing-speculation.pdf

TLDR:

This is related to AMD SEV-SNP, the TEE technology for confidential VMs (cVMs) widely used for confidential handling of apps and data on servers. Intel has a similar technology for cVMs, TDX. AWS Nitro Enclaves, which are widely used, are based on (Firecracker VMs and) EC2 instances with SNP or TDX support, depending on the instance class.

These are memory isolation and encryption technologies for VMs and the apps inside them.
The vulnerability affects sensitive data in enclave memory via a side-channel attack on CPU instruction pipelining, branch prediction, and the TLB cache (the translation of virtual memory addresses into real RAM addresses). Sensitive data can thus leak unencrypted through the processor cache while being handled by some privileged process (the kernel, the KVM subsystem, the hypervisor), where it can be read by other processes in the system (with privilege escalation).
So it's a Meltdown/Spectre-class vulnerability of the superscalar CPU architecture.
The only fix here is disabling branch prediction at the CPU microcode level, via a UEFI/BIOS patch, and patching the (Linux) kernel to disable the IPC mechanisms that cache context between processes, which leads to more kernel-space versus user-space context switching and decreases performance tremendously.

But this particular vulnerability relates to and affects the loading of microcode updates itself: by cracking microcode signature verification, an attacker (who has gained access to the host machine at ring 0 privilege level, e.g. as the root user or via a kernel-space rootkit) can load malicious microcode locally, which will gather confidential data from cVM processes via side channels, from the CPU cache.

The root cause is an insecure hash function used in the CPU microcode for verifying a microcode update's signature before applying it, so the signature hash can be spoofed.
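To make that root cause concrete: the asymmetric signature check can be cryptographically sound and still be bypassed, because it only ever sees the digest of the update blob. A toy sketch (all names hypothetical; nothing here models AMD's actual implementation):

```rust
/// Stand-in for a sound asymmetric signature check over a 16-byte digest.
/// Assume the signature math itself is perfect.
fn signature_is_valid(digest: &[u8; 16], sig: &[u8]) -> bool {
    sig == digest.as_slice() // placeholder so the example runs
}

/// A keyed-MAC-style construction misused as a public hash: keyed MACs
/// give no collision resistance once the key is known or recoverable.
fn weak_digest(blob: &[u8]) -> [u8; 16] {
    let mut d = [0u8; 16];
    for (i, b) in blob.iter().enumerate() {
        d[i % 16] ^= b; // trivially collidable toy stand-in
    }
    d
}

fn accept_update(blob: &[u8], sig: &[u8]) -> bool {
    // The verifier never sees the vendor's original blob, only its digest.
    signature_is_valid(&weak_digest(blob), sig)
}

fn main() {
    let genuine = b"vendor microcode".to_vec();
    let vendor_sig = weak_digest(&genuine).to_vec(); // "signed" digest

    // Attacker crafts a different blob with the same weak digest
    // (appending zero bytes leaves this toy digest unchanged)
    // and replays the vendor's signature.
    let mut malicious = genuine.clone();
    malicious.extend_from_slice(&[0u8; 16]);
    assert_ne!(genuine, malicious);
    assert!(accept_update(&malicious, &vendor_sig)); // forged update accepted
}
```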

#cVM
#TEE
#SEV
#SNP
#SEV_SNP
#AMD


I've also got a question: where did DeepSeek mine and gather all the data for the dataset to train the R1 model? Especially in such a short period of time!

For Facebook/Meta AI with Llama, Microsoft/OpenAI with GPT-4o/o1/o3, and Google AI (plus the teams from Google Brain and DeepMind) with Gemini (formerly Bard), it's naturally sourced either from the web search engine index or from social network user data.

Some independent players have their own data sources: Anthropic with Claude (made by former OpenAI engineers, with Amazon as the largest investor),
and Mistral AI (made by former Google DeepMind and Meta AI engineers) with the open-sourced models Mistral and Mixtral (a mixture-of-experts, MoE, architecture, similar to what DeepSeek uses).

Just think about it: DeepSeek R1 has really good quality of responses and reasoning, which means the dataset and the reinforcement learning processes they use (explained in the paper: https://arxiv.org/pdf/2501.12948, https://raw.githubusercontent.com/deepseek-ai/DeepSeek-R1/refs/heads/main/DeepSeek_R1.pdf) are prepared and tuned at a really outstanding level!

(And now OpenAI claims that DeepSeek steals data from their API and then uses distilled training based on their data.)


Holy Spirit!

Just tried out DeepSeek R1!

And it's so creepy and scary! 😱

Especially the speed (tokens per second) and the reasoning: it reasons, i.e. thinks out loud, like a good old Unix programmer! 😱

I fed it a sample of code I wrote recently for an enclave's init system (just about 400 SLoC in Rust; I had rewritten it from C to Rust) and asked about ways to improve the handling of Unix signals for processes in Rust, in an idiomatic POSIX way but using the Rust standard library. Overall it should be a fully fledged init system for Linux processes residing and executing in enclaves.
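For context, here is a stripped-down sketch of the kind of PID-1 code in question (hypothetical, not my actual enclave init; it leans on the libc crate for waitpid, since the Rust standard library intentionally exposes no POSIX signal/wait API beyond child handles):

```rust
use std::process::Command;
use std::time::Duration;

/// Reap any zombie children re-parented to us: the core PID-1 duty.
fn reap_zombies() {
    loop {
        let mut status: libc::c_int = 0;
        // SAFETY: waitpid with WNOHANG never blocks and only writes
        // through the valid `status` pointer we pass it.
        let pid = unsafe { libc::waitpid(-1, &mut status, libc::WNOHANG) };
        if pid <= 0 {
            break; // no more exited children right now
        }
        eprintln!("init: reaped pid {pid}, status {status}");
    }
}

fn main() -> std::io::Result<()> {
    // Spawn the enclave's main workload (placeholder path).
    let mut child = Command::new("/sbin/app").spawn()?;

    // Naive supervision loop; a real init would block in sigwait/signalfd
    // on SIGCHLD and SIGTERM instead of polling like this.
    loop {
        if let Some(status) = child.try_wait()? {
            eprintln!("init: main process exited: {status}");
            break;
        }
        reap_zombies();
        std::thread::sleep(Duration::from_millis(100));
    }
    Ok(())
}
```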

And you know what?

OpenAI GPT-4o cannot solve this: it always thinks for too long and produces output, but not precisely to my prompts. And then it asks you to pay for a subscription (because the time and token limits are exceeded). 😂 Probably OpenAI does this intentionally to be more commercially efficient. Just a waste of time and money.
But I subscribed out of curiosity.

OpenAI GPT o1, now with the subscription: same thing. It cannot solve the prompt to the fullest. Just 400 SLoC of source code to analyze, and it always stops, asks for refinements, and in the end doesn't give full results, just code snippets that aren't helpful, more like a hallucination (don't use any substances while making code! 😂).

Llama 3.1 70B, self-hosted, works well. It doesn't reason perfectly but gives meaningful hints. Downsides: it also always stops, asking for refinements, and cycles; the dialogue never ends and you never reach a final, meaningful, complex result. The code snippets with examples are helpful. It can be used as a fast search engine for code samples with context for the current task. Overall helpful.

DeepSeek R1:
It's mind-blowing! 😱
One precise prompt.
Precise analysis, weak and strong points, snippets, examples.
Precise reasoning, thinking out loud, like me talking with my computer science teacher at university.
And the speed: it's blazing fast! Tokens-per-second performance is way faster than the others, even visually!
In one full run I got all the answers to my questions.
Best pair programming session with AI overall!
GitHub Copilot sucks in comparison to R1!

This shows us that even in a corporate monopoly market, small companies can make big shifts, bring big differences and value, innovate, and outperform the giants.

My thoughts:
We'll all be replaced by such AI creatures! 😱

Joking! We can collaborate and create beautiful things! The world is definitely changing now! We can adapt to and adopt these technologies, and use them for the greater good! (And I still believe in a bright future.)

Overall, an LLM as a neural network has inputs and outputs, and as an input, for now, it requires an operator, an engineer, a human. It cannot do goal-setting via prompting itself! (At least for now!)

It's an interesting case, and pair programming is such a good application for reasoning LLMs and SLMs!

Paper:
https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdf

Try it yourself:

https://chat.deepseek.com

https://github.com/deepseek-ai/DeepSeek-R1?tab=readme-ov-file#6-how-to-run-locally

https://github.com/deepseek-ai/DeepSeek-V3

Weights (tensor layers) in HF format:
https://huggingface.co/deepseek-ai/DeepSeek-R1

Inference runners:
https://github.com/vllm-project/vllm

https://docs.vllm.ai/en/latest/serving/distributed_serving.html

https://github.com/sgl-project/sglang


For comparison with accessible alternatives:

Llama 3.1 70B chat service:
https://duck.ai


https://chatgpt.com

https://claude.ai


And try deploying Llama 3.1/3.3 70B/405B yourself via a self-hosted installation with some custom inference runner (llama.cpp for example, or its Rust bindings) or a cloud deployment from HuggingFace, and compare:
https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct
https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct
https://huggingface.co/meta-llama/Llama-3.1-405B-Instruct

https://github.com/ggerganov/llama.cpp
https://github.com/mdrokz/rust-llama.cpp
https://github.com/utilityai/llama-cpp-rs


#AI
#AGI
#LLM


The DeepSeek R1 model release: disrupting the corporate monopoly and the "Stargate" project plan.

https://www.youtube.com/watch?v=WEBiebbeNCA

The big deal is that the R1 model from DeepSeek is open-sourced.

Llama 3.1, for example, is still mostly closed, not fully open-sourced (only the model tensor layers in a portable standardized format, a runner for inference, and partially the datasets for the fine-tuned special Llama models), and the FSF recently evaluated Meta's Llama community license as non-free:
https://www.fsf.org/blogs/licensing/llama-3-1-community-license-is-not-a-free-software-license

Meanwhile, Meta has stated that it grants freedom of usage and modification:
https://ai.meta.com/blog/meta-llama-3/

Obviously, unethical usage is possible (in medicine, bioinformatics, the military, as a cyber weapon, etc.).

But this also restricts builders with honest usage from creating other products faster, and gives control over progress in AI only to large corporations and big tech.

Meanwhile, the model from DeepSeek has been given away for free, as in freedom.

This shifts the balance in the current monopoly and gives a powerful tool to builders, to everyone who wants to create services.

This also reveals the recent half-trillion investment in the "Stargate" project plan as inefficient.

Overall, it's a good step forward in the #AI world.

Links:

https://github.com/deepseek-ai/DeepSeek-R1?tab=readme-ov-file#6-how-to-run-locally

https://github.com/deepseek-ai/DeepSeek-R1?tab=readme-ov-file#7-license

https://huggingface.co/deepseek-ai/DeepSeek-R1

https://huggingface.co/deepseek-ai/DeepSeek-R1-Zero
