**kamatahvel** @kamatahvel@infosec.exchange · 6d

Hi #GetFediHired, I'm looking for a #remote role in the US (or #sweden if you provide visa assistance!).

I've worked mostly in the #SoftwareEngineering space, but for the past 3 years I've leaned closer to the #DataEngineering side of things. Before that, I spent ~5 years doing varying levels of SWE work inside a #BusinessIntelligence role.

Looking for something that demands strong #Python skills (~5+ years of heavy, daily use), though I wouldn't mind having to learn something new. Quite comfortable in a few #SQL flavors. I can actually read most #regex, if that's a thing worth bragging about. Love writing #xpath in personal webscraping projects. Somewhat familiar with #SpringBoot and #Kotlin (1 year, occasional) and would like to eventually use more Java at work, but that's not a hard requirement.

I love refactoring/improving old code, and I have lots of experience with CI/CD, coding best practices, testing, web scraping, backend (Flask) & frontend (React, Typescript).

Send me a message if this sounds like I'd be a great fit on your team!

**Sarah Lea** @Sarah_Lea@techhub.social · Jul 28

Why normalize databases?
Yesterday, my tutoring student asked me why databases need to be normalized at all. She said: “Wouldn’t it be easier to just have one big table with all the information?”

It’s a common first question when learning about relational databases.
At first, one big table (e.g. customer name, order date, product name, price) seems easiest.

I told her:
Because that quickly leads to data redundancy, anomalies, and integrity issues when inserting, updating, or deleting records.
Normalization means structuring data into separate, related tables, so that each fact is stored only once. This reduces redundancy & preserves consistency.
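To make the "each fact is stored only once" point concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table and column names (`customers`, `products`, `orders`) and the sample rows are assumptions chosen to mirror the customer/order/product example above; they are not from the original conversation.

```python
import sqlite3

# In-memory database with three related tables instead of one wide table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE products (
        product_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        price      REAL NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        product_id  INTEGER NOT NULL REFERENCES products(product_id),
        order_date  TEXT NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ada');
    INSERT INTO products  VALUES (1, 'Keyboard', 49.0);
    INSERT INTO orders    VALUES (1, 1, 1, '2025-07-01');
    INSERT INTO orders    VALUES (2, 1, 1, '2025-07-15');
""")

# The customer's name lives in exactly one row, so a rename is a single
# update and cannot leave some orders showing the old name.
conn.execute("UPDATE customers SET name = 'Ada L.' WHERE customer_id = 1")

# A join rebuilds the "one big table" view whenever it is needed.
rows = conn.execute("""
    SELECT c.name, o.order_date, p.name, p.price
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    JOIN products  AS p ON p.product_id  = o.product_id
    ORDER BY o.order_id
""").fetchall()

for row in rows:
    print(row)
```

In a single wide table, the same rename would have to touch every order row for that customer, and missing one would be an update anomaly.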

#databases #dataengineering #datascience

**Posit** @Posit@fosstodon.org · Jul 22

What makes tools truly useful?

Episode 2 of #TheTestSet features Wes McKinney (Part 1 of 2!) sharing his experience building Pandas & Arrow, plus his surprising past in speedrun communities.

Tune in for his story at thetestset.co, on Spotify, or Apple Podcasts

#DataStack #DataEngineering #OpenSource

**HackerNoon** @hackernoon@mas.to · Jul 17

Discover how CocoIndex transforms data orchestration with a pure Data Flow Programming model — ensuring traceable, immutable, and declarative pipelines for know https://hackernoon.com/redefining-data-operations-with-data-flow-programming-in-cocoindex-u486ao8 #dataengineering

**Posit** @Posit@fosstodon.org · Jul 15

Ever wonder about the mind behind Pandas & Apache Arrow? Ep. 2 of #TheTestSet (Part 1!) unpacks Wes McKinney's journey – including his speedrunning past! What makes good tools good?

Listen at https://thetestset.co, on Spotify, or Apple Podcasts

#DataStack #DataEngineering #Pandas

**Will Hopkins** @willhopkins@a2mi.social · Jul 15

#dataengineering If you needed to use a data lake with Redshift, would you use Iceberg, given some native support, over Delta Lake, which is arguably a better format?

Asking for a friend who is me

**Rami Krispin** @ramikrispin@mstdn.social · Jul 5

My weekly newsletter is out!

This week's agenda:
- Open Source of the Week - The dagster project
- New learning resources - Forecasting with linear regression, multi-model LLM, multiprocessing with Python
- Book of the week - Visualization for Social Data Science by Roger Beecham

Join 29k subscribers and subscribe to get weekly updates
https://ramikrispin.substack.com/p/the-dagster-project-visualization

#DataScience #DataEngineering #Python

**James Bartlett** @JamesDBartlett3@techhub.social · Jun 30

One does not simply build reports on OLTP data…

This week on The Drill Down with Ahmad & James, our special guest Kristyna Ferris will be presenting a session titled "The Fellowship of the Star Schema: Transforming OLTP Data for Power BI"

This session is packed with:
- Clear distinctions between OLTP & OLAP
- Tips for building Power BI-ready models
- A sprinkle of Slowly Changing Dimension magic

Whether you’re a data wizard, business hobbit, or SQL ranger, this is your quest.

Join us LIVE on LinkedIn | Wednesday, July 2nd @ 2PM Central
https://lnkd.in/eWh4SsBb

#TheDrillDown #MicrosoftFabric #PowerBI

**⚯ Michel de Cryptadamus ⚯** @cryptadamist@universeodon.com · Jun 26 *

pro tip for user interface designers:

if you have hundreds of millions of dollars of venture capital and you want to make a user facing data analytics tool of some kind and you think it's reasonable to ask an average human being to type this:

CAST('2023-05-01' AS TIMESTAMP)

to do literally anything with a date or time in your application's user interface, just stop right there. do not pass go, do not collect $200, and do not ever attempt to offer feedback to a UX designer ever again. something is deeply broken inside you that means there are certain mysteries of the universe that even the guys who designed the postgres command line can access that you will never know, and that's ok. You can still live a really rad life.
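As a rough illustration of the alternative, an application can accept the plain date a person would naturally type and do the conversion itself. This is only a sketch: the helper names `to_timestamp` and `timestamp_literal` are hypothetical, and it assumes the input is an ISO-style date such as `2023-05-01`.

```python
from datetime import datetime


def to_timestamp(text: str) -> datetime:
    """Parse a plain ISO date or date-time string typed by a user."""
    # datetime.fromisoformat accepts date-only input and fills in midnight.
    return datetime.fromisoformat(text.strip())


def timestamp_literal(text: str) -> str:
    """Build the SQL expression on the user's behalf (hypothetical helper)."""
    ts = to_timestamp(text)  # raises ValueError on anything that is not a date
    return f"CAST('{ts:%Y-%m-%d %H:%M:%S}' AS TIMESTAMP)"


print(timestamp_literal("2023-05-01"))
```

In a real application a parameterized query that receives the parsed `datetime` would usually be preferable to building SQL text; the string form is shown only to mirror the expression quoted above.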

#SQL #dba #dataengineering

**⚯ Michel de Cryptadamus ⚯** @cryptadamist@universeodon.com · Jun 26 *

scariest shit i've seen in years

**Lenin alevski** @alevsk@infosec.exchange · Jun 17

New Open-Source Tool Spotlight

Transform any URL into an LLM-ready input with `Reader`. Just prefix the URL with `https://r.jina.ai/` for clean, readable content extraction. Perfect for enhancing agents & RAG pipelines. #LLM #NLP

Need web search results for your LLM? Prepend queries with `https://s.jina.ai/` to fetch top results—content included. E.g., `https://s.jina.ai/your+query` brings knowledge directly to your model. #AItools #DataEngineering
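The two URL prefixes described above can be sketched in a few lines of Python. The helper names (`reader_url`, `search_url`, `fetch_text`) are hypothetical, and the sketch assumes a plain HTTP GET is enough; the service may apply rate limits or expect an API key, so check the project's documentation before relying on it.

```python
from urllib.parse import quote_plus
from urllib.request import urlopen

READER_PREFIX = "https://r.jina.ai/"
SEARCH_PREFIX = "https://s.jina.ai/"


def reader_url(page_url: str) -> str:
    """Prefix a page URL so the Reader endpoint returns its cleaned content."""
    return READER_PREFIX + page_url


def search_url(query: str) -> str:
    """Encode a search query (spaces become '+') behind the search prefix."""
    return SEARCH_PREFIX + quote_plus(query)


def fetch_text(url: str, timeout: float = 30.0) -> str:
    """Fetch a URL and decode the body as UTF-8 (assumes no auth is needed)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


print(reader_url("https://example.com/post"))
print(search_url("your query"))
```

Calling `fetch_text(reader_url(...))` would then hand the extracted text to whatever builds the prompt for an agent or RAG pipeline.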

Reader API now supports images! Captions are auto-generated for images missing alt tags, giving LLMs better context for reasoning and summarizing multimedia pages. #MachineLearning #AI

Project link on #GitHub https://github.com/jina-ai/reader

#Infosec #Cybersecurity #Software #Technology #News #CTF #Cybersecuritycareer #hacking #redteam #blueteam #purpleteam #tips #opensource #cloudsecurity

—
P.S. Found this helpful? Tap Follow for more cybersecurity tips and insights! I share weekly content for professionals and people who want to get into cyber. Happy hacking

**dealingwith** @dealingwith@indieweb.social · Jun 10

If anyone knows Data Engineers looking for work, this is our next hire: https://www.linkedin.com/posts/dealingwith_dataengineering-hiring-startuplife-activity-7338312558455476224-jvGh

https://billee.applytojob.com/apply/iTXqZOqOUu/Senior-Data-Engineer

www.linkedin.com · #dataengineering #hiring #startuplife | Daniel Miller

**Senior Data Engineer: Build From a Clean Slate**

Ever wish you could design a data platform without legacy hurdles? Here’s your chance. We’re looking for our first dedicated Senior Data Engineer to join forces with our Principal Engineer and lay the foundation from day one.

Utility companies sit on massive billing datasets they struggle to interpret. At Billee Technologies, Inc. we are building AI-driven products to turn that noise into clear, actionable insights, and they need reliable infrastructure to do it. You’ll architect everything from data lakes and ELT pipelines to the models that fuel our AI.

**What you’ll work with**
- Python and SQL for core development
- dbt for modeling and transformation
- Orchestration tools like Airflow or Dagster to keep everything humming

If turning open-ended problems into rock-solid data solutions sounds like your kind of challenge, let’s talk. https://lnkd.in/gfC-rZKc #DataEngineering #Hiring #StartupLife 🐐 🚀

#DataEngineering #hiring #getfedihired

**Mike Spencer** @mikerspencer@mastodon.scot · Jun 4

A great job with a fantastic group: https://www.dataorchard.org.uk/analytics-engineer-vacancy

#DataScience #DataEngineering #RStats #JobFairy #FediHire @data_orchard

Data Orchard: Data Analytics Engineer Job Vacancy. Data Orchard is recruiting for an Analytics Engineer to join our small but vibrant and exciting team.

**Seán Fobbe** @seanfobbe@fediscience.org · Jun 2

New Slides

I've made the slides for my recent talk on "Legal Data Engineering" available #OpenAccess

Slides: https://zenodo.org/records/15575231 (in German)

#DataEngineering #Law

**Seán Fobbe** @seanfobbe@fediscience.org · Jun 2

Slides on Legal Data Engineering

What is Legal Data Engineering? What does working with legal data look like in practice in Germany? What legal problems arise in connection with Legal Data Engineering? This presentation offers an introduction to Legal Data Engineering and looks for answers to these questions.

Slides: https://zenodo.org/records/15575231/files/Fobbe_2025-05-28_Legal-Data-Engineering.pdf?download=1

Legal Data Engineering is the main focus of every Legal Data Science project. The core of data engineering is the ETL process: extracting, transforming, and loading (or uploading) data. The slides give a generally accessible overview of it.

Further practical topics are the availability of legal data in Germany (in particular structured data and programming interfaces), problems with tokenization in large language models, and the misrecognition of gene names in Microsoft Excel.

On the legal questions of Legal Data Engineering, I cover the traditional legal situation, the new Data Use Act (Datennutzungsgesetz, DNG), and Bavaria as a negative example of a closed legal data culture. A discussion of the data protection lawsuit against OpenJur and the open data lawsuit by the Gesellschaft für Freiheitsrechte (GFF) against the Federal Police sheds light on current developments in this area of law.

#DataEngineering #OpenAccess #OpenScience

**Seán Fobbe** @seanfobbe@fediscience.org · May 27

Talk on May 28

Tomorrow, May 28, at 7 pm (19:00) I'll be speaking online at the Legal Tech Lab Cologne about "Legal Data Engineering". Everyone is welcome!

We'll talk about the fundamentals of Legal Data Engineering (as a subfield of Legal Data Science), Legal Data Engineering in practice, and the legal framework for legal data in Germany.

There will also be an opportunity to exchange ideas and network with like-minded people.

Access details: https://seanfobbe.com/de/posts/2025-05-13_vortrag-legal-data-engineerg-legal-tech-lab-cologne/

Seán Fobbe · May 13 · [May 28, 2025] Talk on Legal Data Engineering (Online)
