Sublime
An inspiration engine for ideas
Jacopo Tagliabue • Reproducible data science over data lakes: replayable data pipelines with Bauplan and Nessie.
Built For Growth
Don't hack custom scripts or use half-baked tools. SQLMesh ensures accurate and efficient data pipelines with the most complete DataOps solution for transformation, testing, and collaboration.
SQLMesh
Let us briefly touch on the concepts of data science and data engineering. If we go back to the DIKW triangle, we can say that data science focuses on extracting knowledge and wisdom from the information we have. Data scientists combine tools from mathematics and statistics to analyze information to arrive at insights. Exponentially increasing amou...
Pierre Pureur • Continuous Architecture in Practice: Software Architecture in the Age of Agility and DevOps (Addison-Wesley Signature Series (Vernon))
Shreya Shankar • "We Have No Idea How Models will Behave in Production until Production": How Engineers Operationalize Machine Learning.
The part of the system I'm most proud of, and on which I spent the most effort, is the ETL process.
We had a series of shell scripts for each data source we ingested (there were many), which would pull the data and put it in an s3 bucket.
Then, early in the morning, a cron job would spin up an EC2 instance, which would pull in the latest ETL code...
Bill Mill • notes.billmill.org
The platform needs to facilitate integrating new data, ad hoc queries, and visualization to accelerate human understanding. As valuable insights emerge from this platform, they become the requirements for changes to production systems and processes.
Thomas H. Davenport • Big Data at Work: Dispelling the Myths, Uncovering the Opportunities
Jan-Erik Asplund • Earl Lee, co-founder and CEO of HeadsUp, on the modern data stack value chain
... bound for the data warehouse and BI applications. ETL tools are central to first-rate BI environments. They are mature tools that reduce development time, manage the flow of data along the BI value chain, and provide the means to manage changes to data over time as transactional systems and enterprise applications evolve. One key component of an ET...