Expectations for data specialists are high. Data scientists are perceived as magicians who can wrangle data, whip up an algorithm and pull a result out of a hat, on demand.
Like you, we know that it takes work, trial and error, and a community to deliver quality analysis and valuable business insights. This blog summarizes some of the best examples that appear across multiple journals. In short, if you need a tip for solving a business or data problem, check here first. We aren’t replacing the original articles; we are summarizing key points to help you zero in on where to find the details.
Let’s get started.
Author: Tirthajyoti Sarkar, ON Semiconductor
Source: https://towardsdatascience.com/activation-maps-for-deep-learning-models-in-a-few-lines-of-code-ed9ced1e8d21
How: Convolutional Neural Network (CNN), using a single function to streamline image classification with Keras and a nice little library called Keract. Detailed instructions and code are available via a Jupyter notebook.
When to use this: When you need to build a deep learning model from an image dataset and test images within that set
Why it’s helpful: As the author states in his related article, “we aim to write a single utility function, which can take just the name of your folder where training images are stored, and give you back a fully trained CNN model.” “You can train a CNN, generate activation maps, and display them layer by layer — from scratch.”
Suggested application: Visualizing unstructured data such as images, text or audio
Business impact or insights to be gained: Catch dead filters and high learning rates, and add nuance to predictions, using the activation maps
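To make "activation map" and "dead filter" concrete, here is a minimal pure-Python sketch. It is not the author's utility function and it does not use the Keras or Keract APIs; the image, filters, biases and the function names `conv2d_relu` and `dead_filters` are all hypothetical, chosen only to illustrate the idea. An activation map is what one filter outputs for one image, and a filter that outputs zeros for every image is a candidate dead filter.

```python
# Illustrative sketch only: plain-Python convolution + ReLU, not the Keras/Keract API.

def conv2d_relu(image, kernel, bias=0.0):
    """Activation map of one filter: valid 2D cross-correlation followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    activation = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = bias
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(max(0.0, total))  # ReLU clips negative responses to zero
        activation.append(row)
    return activation


def dead_filters(images, filters, biases):
    """Indices of filters whose activation map is all zeros for every image."""
    dead = []
    for idx, (kernel, bias) in enumerate(zip(filters, biases)):
        fires = any(
            value > 0.0
            for image in images
            for row in conv2d_relu(image, kernel, bias)
            for value in row
        )
        if not fires:
            dead.append(idx)
    return dead


# Hypothetical 4x4 grayscale image containing a vertical edge.
image = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]

# Two hypothetical 2x2 filters: an edge detector, and one whose large
# negative bias keeps it from ever firing.
filters = [
    [[-1.0, 1.0], [-1.0, 1.0]],
    [[0.1, 0.1], [0.1, 0.1]],
]
biases = [0.0, -5.0]

edge_map = conv2d_relu(image, filters[0], biases[0])
dead = dead_filters([image], filters, biases)

for row in edge_map:
    print(row)
print("dead filter indices:", dead)
```

In a real Keras model the same check would run over the activations of each convolutional layer for a batch of test images; the article's notebook shows how to obtain and display those with Keract.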
Author: Tirthajyoti Sarkar, ON Semiconductor
Source: https://jaxenter.com/data-science-synthetic-162053.html, https://towardsdatascience.com/synthetic-data-generation-a-must-have-skill-for-new-data-scientists-915896c0c1ae and https://www.snowflake.com/blog/synthetic-data-generation-at-scale-part-1/
How: Python library
When to use this: When you must comply with regulations such as GDPR or otherwise need anonymized data
Why it’s helpful: Helps you meet regulatory requirements and run experiments with classification, regression and clustering algorithms
Suggested application: Settings with privacy and security concerns, such as medical services or the military
Business impact or insights to be gained: Run simulations and convert existing data that needs to be anonymized, while retaining the underlying statistics of the original data
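As a rough illustration of "retaining underlying statistics", the sketch below fits a normal distribution to each column of a small, made-up dataset and samples synthetic values from it. This is not the method from the linked articles: it uses only the Python standard library, the column names and values are hypothetical, and it assumes each column is roughly Gaussian and independent of the others, so correlations between columns are lost. It also does not by itself guarantee anonymity.

```python
from statistics import NormalDist, mean, stdev


def synthesize_column(values, n, seed=None):
    """Draw n synthetic values from a normal distribution fitted to `values`."""
    fitted = NormalDist.from_samples(values)  # estimates mean and standard deviation
    return fitted.samples(n, seed=seed)


# Hypothetical "sensitive" records; not real patient data.
real = {
    "age": [34, 45, 29, 61, 52, 47, 38, 70, 55, 41],
    "systolic_bp": [118, 130, 112, 145, 138, 127, 121, 150, 135, 125],
}

# One synthetic column per real column, each with its own seed for reproducibility.
synthetic = {
    column: synthesize_column(values, n=1000, seed=42 + i)
    for i, (column, values) in enumerate(real.items())
}

# Compare summary statistics of the real and synthetic columns.
for column in real:
    print(
        column,
        "real mean/stdev:", round(mean(real[column]), 1), round(stdev(real[column]), 1),
        "synthetic mean/stdev:", round(mean(synthetic[column]), 1), round(stdev(synthetic[column]), 1),
    )
```

For classification, regression or clustering experiments, a dedicated generator that also models relationships between columns would be a better fit than this per-column sketch.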
Author: Alex Grizhnevich
Source: https://www.scnsoft.com/blog/how-to-monitor-machine-utilization-across-distributed-factories-with-iiot
How: Connect machines via an Ethernet port, a serial-to-Ethernet converter, or the machine’s PLC
When to use this: When looking to incorporate IoT into manufacturing, including alongside legacy infrastructure
Why it’s helpful: Improves efficiency and quality measures, and supplies the data that senior leadership across the business expects
Suggested application: A distributed manufacturing organization with a mix of legacy and newer equipment, including some without cloud connectivity
Business impact or insights to be gained: Combine IoT measurements with efficiency analysis to predict when to upgrade equipment and to anticipate the impact of potential shutdowns
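Once machine states reach a central store, utilization is a simple calculation. The sketch below is one hedged way to compute it, not the approach from the linked article: the `Reading` record, the state names ("running", "idle", "down") and the `utilization` function are hypothetical, and it assumes each reported state holds until the next reading arrives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    timestamp: float  # seconds since the start of the shift
    state: str        # hypothetical states: "running", "idle", "down"


def utilization(readings, shift_end, running_state="running"):
    """Fraction of the observed window spent in `running_state`.

    Each reading's state is assumed to hold until the next reading,
    or until `shift_end` for the last reading.
    """
    ordered = sorted(readings, key=lambda r: r.timestamp)
    if not ordered:
        return 0.0
    total = shift_end - ordered[0].timestamp
    if total <= 0:
        return 0.0
    # Pair every reading with the time at which its state ends.
    end_times = [r.timestamp for r in ordered[1:]] + [shift_end]
    running = 0.0
    for current, end_time in zip(ordered, end_times):
        if current.state == running_state:
            running += end_time - current.timestamp
    return running / total


# Hypothetical 8-hour shift (28,800 seconds) for one machine.
readings = [
    Reading(0, "running"),
    Reading(3600, "idle"),
    Reading(5400, "running"),
    Reading(21600, "down"),
]

rate = utilization(readings, shift_end=28800)
print(f"utilization: {rate:.2%}")
```

Running the same function per machine and per factory gives comparable figures across sites, which is the kind of input needed for upgrade and shutdown planning.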
Bonus article: Trying to explain Data Governance to leaders in your organization? Wondering where your company stacks up against others? This reference can help you with some stats, some definitions and some best practices to help make the argument in your place of work. https://bi-survey.com/data-governance
Subscribe to get more tips and references. Have an article you’d like featured? Send us a note at Contact Us.