
Introduction


Module Federation is an architectural pattern for the decentralization of JavaScript applications (similar to microservices on the server-side). It allows you to share code and resources among multiple JavaScript applications (or micro-frontends). This can help you:

  • Reduce code duplication
  • Improve code maintainability
  • Lower the overall size of your applications
  • Enhance the performance of your applications

✨ What is Module Federation 2.0?

Module Federation 2.0 differs from the Module Federation built into Webpack 5 by providing not only the core features of module export, loading, and dependency sharing, but also dynamic type hinting, a Manifest, a Federation Runtime, and a Runtime Plugin System. These features make Module Federation more suitable for use as a micro-frontend architecture in large-scale web applications.


🎯 Use Cases

Module Federation is suitable for the following scenarios:

  • Large Applications: For large applications, you can break the application into multiple micro-frontends and use Module Federation to share code and resources between them.
  • Microfrontend Architecture: Module Federation is an ideal tool for building microfrontend architectures.
  • Multi-team Development: Module Federation can assist multiple teams in collaboratively developing large applications.

🕠 History of Module Federation

Module Federation is a new feature introduced in Webpack 5, but its history dates back to 2017. At that time, the Webpack team began exploring a way to share code between multiple applications.

  • In 2018, Webpack 4.20 was released, introducing module hooks, which laid the foundation for the development of Module Federation.

  • In 2020, Webpack 5 was released, officially introducing the Module Federation feature.

Module Federation has become a powerful tool for building modern web applications.

🕰️ The Future of Module Federation

Module Federation aims to become an architectural method for building large web applications, similar to microservices on the backend. It will provide more capabilities to meet the foundational needs of decentralized large web applications, currently including:

  • Providing comprehensive Devtool tools
  • Offering more high-level framework capabilities like Router, Sandbox, SSR
  • Providing best practices for large web applications based on Module Federation


GitHub - PriorLabs/TabPFN


Official installation (pip):

pip install tabpfn

OR installation from source:

pip install "tabpfn @ git+https://github.com/PriorLabs/TabPFN.git"

OR local development installation:

git clone https://github.com/PriorLabs/TabPFN.git
pip install -e "TabPFN[dev]"

Classification example:

from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

from tabpfn import TabPFNClassifier

# Load data
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=42)

# Initialize a classifier
clf = TabPFNClassifier()
clf.fit(X_train, y_train)

# Predict probabilities
prediction_probabilities = clf.predict_proba(X_test)
print("ROC AUC:", roc_auc_score(y_test, prediction_probabilities[:, 1]))

# Predict labels
predictions = clf.predict(X_test)
print("Accuracy", accuracy_score(y_test, predictions))

Regression example:

from sklearn.datasets import fetch_openml
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Assuming there is a TabPFNRegressor (if not, a different regressor should be used)
from tabpfn import TabPFNRegressor

# Load Boston Housing data
df = fetch_openml(data_id=531, as_frame=True)  # Boston Housing dataset
X = df.data
y = df.target.astype(float)  # Ensure target is float for regression

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=42)

# Initialize the regressor
regressor = TabPFNRegressor()
regressor.fit(X_train, y_train)

# Predict on the test set
predictions = regressor.predict(X_test)

# Evaluate the model
mse = mean_squared_error(y_test, predictions)
r2 = r2_score(y_test, predictions)
print("Mean Squared Error (MSE):", mse)
print("R² Score:", r2)

For optimal performance, use the AutoTabPFNClassifier or AutoTabPFNRegressor for post-hoc ensembling. These can be found in the TabPFN Extensions repository. Post-hoc ensembling combines multiple TabPFN models into an ensemble.

Steps for Best Results:

  1. Install the extensions:

     git clone https://github.com/priorlabs/tabpfn-extensions.git
     pip install -e tabpfn-extensions

  2. Use the post-hoc ensemble classifier:

     from tabpfn_extensions.post_hoc_ensembles.sklearn_interface import AutoTabPFNClassifier

     clf = AutoTabPFNClassifier(max_time=120, device="cuda")  # 120 seconds tuning time
     clf.fit(X_train, y_train)
     predictions = clf.predict(X_test)

Choose the right TabPFN implementation for your needs:

  • TabPFN Client
    Simple API client for using TabPFN via cloud-based inference.

  • TabPFN Extensions
    A powerful companion repository packed with advanced utilities, integrations, and features - great place to contribute:

    • 🔍 interpretability: Gain insights with SHAP-based explanations, feature importance, and selection tools.
    • 🕵️‍♂️ unsupervised: Tools for outlier detection and synthetic tabular data generation.
    • 🧬 embeddings: Extract and use TabPFN’s internal learned embeddings for downstream tasks or analysis.
    • 🧠 many_class: Handle multi-class classification problems that exceed TabPFN's built-in class limit.
    • 🌲 rf_pfn: Combine TabPFN with traditional models like Random Forests for hybrid approaches.
    • ⚙️ hpo: Automated hyperparameter optimization tailored to TabPFN.
    • 🔁 post_hoc_ensembles: Boost performance by ensembling multiple TabPFN models post-training.

    ✨ To install:

    git clone https://github.com/priorlabs/tabpfn-extensions.git
    pip install -e tabpfn-extensions
  • TabPFN (this repo)
    Core implementation for fast and local inference with PyTorch and CUDA support.

  • TabPFN UX
    No-code graphical interface to explore TabPFN capabilities—ideal for business users and prototyping.

Prior Labs License (Apache 2.0 with additional attribution requirement).

We're building the future of tabular machine learning and would love your involvement:

  1. Connect & Learn:

  2. Contribute:

    • Report bugs or request features
    • Submit pull requests
    • Share your research and use cases
  3. Stay Updated: Star the repo and join Discord for the latest updates

You can read our paper explaining TabPFN here.

@article{hollmann2025tabpfn,
  title={Accurate predictions on small data with a tabular foundation model},
  author={Hollmann, Noah and M{\"u}ller, Samuel and Purucker, Lennart and Krishnakumar, Arjun and K{\"o}rfer, Max and Hoo, Shi Bin and Schirrmeister, Robin Tibor and Hutter, Frank},
  journal={Nature},
  year={2025},
  month={01},
  day={09},
  doi={10.1038/s41586-024-08328-6},
  publisher={Springer Nature},
  url={https://www.nature.com/articles/s41586-024-08328-6},
}

@inproceedings{hollmann2023tabpfn,
  title={TabPFN: A transformer that solves small tabular classification problems in a second},
  author={Hollmann, Noah and M{\"u}ller, Samuel and Eggensperger, Katharina and Hutter, Frank},
  booktitle={International Conference on Learning Representations 2023},
  year={2023}
}

Q: What dataset sizes work best with TabPFN?
A: TabPFN is optimized for datasets up to 10,000 rows. For larger datasets, consider using Random Forest preprocessing or other extensions. See our Colab notebook for strategies.
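
One simple strategy (a hedged sketch, not from the TabPFN docs, which point to the extensions and the Colab notebook instead) is to randomly subsample the training data down to roughly 10,000 rows before fitting:

import numpy as np
from tabpfn import TabPFNClassifier

# Assumes X_train and y_train are NumPy arrays larger than the ~10,000-row sweet spot.
rng = np.random.default_rng(42)
n_max = 10_000

if len(X_train) > n_max:
    idx = rng.choice(len(X_train), size=n_max, replace=False)  # random subsample
    X_fit, y_fit = X_train[idx], y_train[idx]
else:
    X_fit, y_fit = X_train, y_train

clf = TabPFNClassifier()
clf.fit(X_fit, y_fit)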

Q: Why can't I use TabPFN with Python 3.8?
A: TabPFN v2 requires Python 3.9+ due to newer language features. Compatible versions: 3.9, 3.10, 3.11, 3.12, 3.13.

Q: How do I use TabPFN without an internet connection?

TabPFN automatically downloads model weights when first used. For offline usage:

Using the Provided Download Script

If you have the TabPFN repository, you can use the included script to download all models (including ensemble variants):

# After installing TabPFN
python scripts/download_all_models.py

This script will download the main classifier and regressor models, as well as all ensemble variant models to your system's default cache directory.

Manual Download

  1. Download the model files manually from HuggingFace:

  2. Place the file in one of these locations:

    • Specify directly: TabPFNClassifier(model_path="/path/to/model.ckpt")
    • Set environment variable: os.environ["TABPFN_MODEL_CACHE_DIR"] = "/path/to/dir"
    • Default OS cache directory:
      • Windows: %APPDATA%\tabpfn\
      • macOS: ~/Library/Caches/tabpfn/
      • Linux: ~/.cache/tabpfn/
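
For example, a minimal sketch of the first two options (the paths below are placeholders) could look like this:

import os

# Option 1: point TabPFN at a custom cache directory via the environment variable
# mentioned above, before the model is instantiated.
os.environ["TABPFN_MODEL_CACHE_DIR"] = "/path/to/dir"

from tabpfn import TabPFNClassifier

# Option 2: pass the downloaded checkpoint path directly (placeholder file name).
clf = TabPFNClassifier(model_path="/path/to/model.ckpt")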

Q: I'm getting a pickle error when loading the model. What should I do?
A: Try the following:

  • Upgrade to the newest version of tabpfn: pip install tabpfn --upgrade
  • Ensure model files downloaded correctly (re-download if needed)

Q: Can TabPFN handle missing values?
A: Yes!
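
As a small illustration (a sketch assuming missing entries are encoded as NaN in a float array), data with gaps can be passed to the classifier directly:

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from tabpfn import TabPFNClassifier

X, y = load_breast_cancer(return_X_y=True)

# Blank out ~5% of the entries at random to simulate missing values.
rng = np.random.default_rng(0)
X = X.astype(float)
X[rng.random(X.shape) < 0.05] = np.nan

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = TabPFNClassifier()
clf.fit(X_train, y_train)  # NaN values are passed through as-is
print(clf.predict(X_test)[:5])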

Q: How can I improve TabPFN’s performance?
A: Best practices:

  • Use AutoTabPFNClassifier from TabPFN Extensions for post-hoc ensembling
  • Feature engineering: Add domain-specific features to improve model performance

  Not effective:
  • Adapt feature scaling
  • Convert categorical features to numerical values (e.g., one-hot encoding)

Development setup:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
git clone https://github.com/PriorLabs/TabPFN.git
cd tabpfn
pip install -e ".[dev]"
pre-commit install
pre-commit run --all-files

Built with ❤️ by Prior Labs - Copyright (c) 2025 Prior Labs GmbH


Minimal CSS-only blurry image placeholders


qdm12/gluetun: VPN client in a thin Docker container for multiple VPN providers, written in Go, and using OpenVPN or Wireguard, DNS over TLS, with a few proxy servers built-in.


How MIG maximizes GPU efficiency on OpenShift AI | Red Hat Developer


Modern data science workloads demand high computational power, and Graphics Processing Units (GPUs) are often at the heart of these operations. However, sharing GPU resources efficiently among multiple users or workloads can be challenging. NVIDIA Multi-Instance GPU (MIG) technology offers a solution. This article explores how I tested MIG on Red Hat OpenShift AI using an NVIDIA Ampere architecture GPU and the benefits for AI and data science teams.

The NVIDIA MIG solution and test

GPUs in a Kubernetes environment are assigned to pods in a 1:1 ratio by default. This means a single GPU is dedicated to one pod, regardless of whether the workload fully utilizes the GPU’s capacity. This limitation can lead to inefficient resource usage, especially for smaller workloads. NVIDIA MIG solves this issue by splitting a single GPU into multiple independent instances to be used by different pods. This feature maximizes GPU utilization and ensures resources are not wasted. In the next sections, I will demonstrate how I tested MIG on Red Hat OpenShift AI.

Prepare the environment

For this test, certain preparatory steps are required to leverage MIG on OpenShift. I used Azure’s Standard_NC24ads_A100_v4 virtual machine (VM), equipped with an NVIDIA A100 PCIe 80GB GPU as an OpenShift worker (Figure 1).

Step 1: Install NFD

First, I installed the Node Feature Discovery (NFD) operator, as shown in Figures 2 and 3.

This operator detects hardware features and ensures that GPUs are discoverable by the NVIDIA GPU operator.

Many labels are added to the node, indicating that the operator has detected its GPU:

$ oc describe node/ods-cluster-mqt7l-worker-eastus2-fn5w8
Labels:    beta.kubernetes.io/arch=amd64
           feature.node.kubernetes.io/cpu-cpuid.ADX=true
           feature.node.kubernetes.io/cpu-cpuid.AESNI=true
           ...
           feature.node.kubernetes.io/cpu-cpuid.FMA3=true
           feature.node.kubernetes.io/gpu.present=true
           feature.node.kubernetes.io/gpu.memory=80GB
           feature.node.kubernetes.io/gpu.vendor=nvidia
           feature.node.kubernetes.io/gpu.model=A100

Step 2: Install the NVIDIA GPU operator

Next, I installed the NVIDIA GPU operator, which handles the configuration of GPU resources (Figure 4).

I made sure to enable the MIG manager in the ClusterPolicy configuration to facilitate the MIG setup (Figure 5).

Step 3: Check the pods

There are two ways to make sure all pods under the nvidia-gpu-operator namespace are up and running:

  1. From the CLI:

    $ oc get pods -n nvidia-gpu-operator
  2. From the console, as shown in Figure 6:

Choose the right MIG configuration

MIG offers a variety of configurations tailored to different GPU models and workload requirements, so you have to understand which configurations are supported for the NVIDIA A100 80GB GPU. I ran the command oc describe configmap/default-mig-parted-config, explored the available configurations, and selected one that matched my requirements: all-1g.10gb, which divides the GPU into seven instances.

The following configuration is ideal for workloads that require smaller, dedicated slices of GPU power.

# H100-80GB, H800-80GB, A100-80GB, A800-80GB, A100-40GB, A800-40GB
all-1g.10gb:
  # H100-80GB, H800-80GB, A100-80GB, A800-80GB
  - device-filter: ["0x233010DE", "0x233110DE", "0x232210DE", "0x20B210DE", "0x20B510DE", "0x20F310DE", "0x20F510DE", "0x232410DE"]
    devices: all
    mig-enabled: true
    mig-devices:
      "1g.10gb": 7

Enable and verify MIG

To verify the setup, I used the nvidia-smi tool to query the GPU status and configuration. MIG was initially disabled, so I enabled it and restarted the node:

sh-4.4# nvidia-smi -i 0 -mig 1
Enabled MIG Mode for GPU 00000001:00:00.0
All done.

To verify that MIG is enabled for the GPU, I connected to the nvidia-mig-manager pod in OpenShift and used the terminal tab to query GPU=0 configurations with the following command:

sh-4.4# nvidia-smi -i 0 -q
==============NVSMI LOG==============
Timestamp                           : Tue Dec  5 15:41:13 2023
Driver Version                      : 535.104.12
CUDA Version                        : Not Found
Attached GPUs                       : 1
GPU 00000001:00:00.0
    Product Name                    : NVIDIA A100 80GB PCIe
    Product Brand                   : NVIDIA
    Product Architecture            : Ampere
    Display Mode                    : Enabled
    Display Active                  : Disabled
    Persistence Mode                : Enabled
    Addressing Mode                 : None
    MIG Mode
        Current                     : Enabled
        Pending                     : Enabled

After selecting the configuration, I labeled the node with the following command:

$ oc label node <node-name> nvidia.com/mig.config=all-1g.10gb --overwrite

The MIG manager pod logs provide insight into the status of the node labeling process (Figure 7).

Once successful, the node reported multiple allocatable GPUs instead of a single one.

Let's describe the node to confirm that it recognizes seven GPUs:

$ oc describe node/ods-cluster-mqt7l-worker-eastus2-fn5w8
Capacity:
  attachable-volumes-azure-disk:  8
  cpu:                            24
  ephemeral-storage:              133682156Ki
  hugepages-1Gi:                  0
  hugepages-2Mi:                  0
  memory:                         226965748Ki
  nvidia.com/gpu:                 7
  pods:                           250
Allocatable:
  attachable-volumes-azure-disk:  8
  cpu:                            23500m
  ephemeral-storage:              122127732942
  hugepages-1Gi:                  0
  hugepages-2Mi:                  0
  memory:                         225814772Ki
  nvidia.com/gpu:                 7
  pods:                           250

Consume the sliced GPUs via Red Hat OpenShift AI

With MIG enabled, the OpenShift AI dashboard reflected the increased availability of GPU resources. I could select up to seven GPUs for my workbench (Figure 8). This setup empowers AI and data science teams to run diverse workloads simultaneously without bottlenecks.

Unlock GPU potential with NVIDIA MIG and OpenShift AI

NVIDIA MIG technology, integrated with Red Hat OpenShift AI, transforms GPU resource management by facilitating scalable and efficient workloads. By partitioning GPUs into smaller, independent units, organizations can achieve maximum resource utilization, cost savings, and streamlined AI/ML operations. MIG on OpenShift AI helps teams fully harness the power of GPU technology, whether they manage diverse workloads or scale multi-user environments.

Learn more about using NVIDIA NIM on Red Hat OpenShift AI and the performance results shown by Red Hat AI Performance and Scale when testing NVIDIA GPUs with MIG.


Dumping packets from anywhere in the networking stack | Red Hat Developer


Dumping traffic on a network interface is one of the most performed steps while debugging networking and connectivity issues. On Linux, tcpdump is probably the most common way to do this, but some use Wireshark too.

Where does tcpdump get the packets from?

Internally, both tcpdump and Wireshark use the Packet Capture (pcap) library. When capturing packets, a socket with the PF_PACKET domain is created (see man packet), which allows receiving and sending packets at layer 2 of the OSI model.

From libpcap:

sock_fd = is_any_device ?
       socket(PF_PACKET, SOCK_DGRAM, 0) :
       socket(PF_PACKET, SOCK_RAW, 0);

Note that the last parameter in the socket call is later set to a specific protocol, or to ETH_P_ALL if none is explicitly provided. The latter causes all packets to be received by the socket.

This allows packets to be retrieved directly after the device driver on ingress, without any change being made to the packet, and right before entering the device driver on egress. To put it differently, packets are seen between the networking stack and the NIC drivers.

Limitations

While the above use of PF_PACKET works nicely, it also comes with limitations. As packets are retrieved from a very specific and well-defined place in the networking stack, they can only be seen in the state they were in at that point: on ingress, packets are seen before being processed by the firewall or qdiscs, and the opposite is true on egress.

Offline analysis

By default, tcpdump and Wireshark process packets live at runtime. But they can also store the captured packets data to a file for later analysis (-w option for tcpdump). The pcap file format (application/vnd.tcpdump.pcap) is used. Both tools (and others, e.g., tshark), support reading pcap formatted files.

How to capture packets from other places?

Retrieving packets from other places of the networking stack using tcpdump or Wireshark is not possible. However, other initiatives have emerged that target monitoring traffic within a single host, such as Retis (documentation).

Retis is a recently released tool that aims to improve visibility into the Linux networking stack and its various control and data paths. It captures networking-related events and provides relevant context using eBPF, one notable feature being the ability to capture packets in any packet-aware (i.e., socket-buffer handling) kernel function and tracepoint.

To capture packets from the net:netif_receive_skb tracepoint:

$ retis collect -c skb -p net:netif_receive_skb
4 probe(s) loaded
4581128037918 (8) [irq/188-iwlwifi] 1264 [tp] net:netif_receive_skb
 if 4 (wlp82s0) 2606:4700:4700::1111.53 > [redacted].34952 ttl 54 label 0x66967 len 79 proto UDP (17) len 71

Note that Retis can capture packets from multiple functions and tracepoints by using the above -p option multiple times. It can even identify packets and reconstruct their flow! To get a list of compatible functions and tracepoints, use retis inspect -p.

Also note that, by default, tcpdump and Wireshark put devices in promiscuous mode when dumping packets from a specific interface. This is not the case with Retis; an interface can be set in this mode manually by using ip link set <interface> promisc on.

In addition to the above, another tool provides a way to capture packets and convert them to a pcap file: bpftrace. It is a wonderful tool but is more low-level: it requires you to write the probe definitions by hand and to compile the BPF program on the target. Here the skboutput function can be used, as shown in the help.

Making the link

That's nice, but while Retis is a powerful tool when used standalone, we might want to use the existing tcpdump and Wireshark tools but with packets captured from other places of the networking stack.

This can be done using the Retis pcap post-processing sub-command. It works in two steps: Retis first captures and stores packets, then post-processes them. The pcap sub-command converts packets saved by Retis to the pcap format, which can then feed existing pcap-aware tools such as tcpdump and Wireshark:

$ retis collect -c skb -p net:netif_receive_skb -p net:net_dev_start_xmit -o
$ retis print
4581115688645 (9) [isc-net-0000] 12796/12797 [tp] net:net_dev_start_xmit
 if 4 (wlp82s0) [redacted].34952 > 2606:4700:4700::1111.53 ttl 64 label 0x79c62 len 59 proto UDP (17) len 51
4581128037918 (8) [irq/188-iwlwifi] 1264 [tp] net:netif_receive_skb
 if 4 (wlp82s0) 2606:4700:4700::1111.53 > [redacted].34952 ttl 54 label 0x66967 len 79 proto UDP (17) len 71

$ retis pcap --probe net:net_dev_start_xmit | tcpdump -nnr -
01:31:55.688645 IP6 [redacted].34952 > 2606:4700:4700::1111.53: 28074+ [1au] A? redhat.com. (51)

$ retis pcap --probe net:netif_receive_skb -o retis.pcap
$ wireshark retis.pcap

As seen above, Retis can collect packets from multiple probes during the same session. All packets seen on a given probe can then be filtered and converted to the pcap format.

When generating pcap files, Retis adds a comment in every packet with a description of the probe the packet was retrieved on:

$ capinfos -p retis.pcap
File name:           retis.pcap
Packet 1 Comment:    probe=raw_tracepoint:net:netif_receive_skb

In many cases, tools like tcpdump and Wireshark are sufficient. But due to their design, they can only dump packets from a very specific place in the networking stack, which can be limiting. When that's the case, it's possible to use more recent tools like Retis, either standalone or in combination with the beloved pcap-aware utilities, to keep using familiar tools or to easily integrate packet capture into existing scripts.
