Chaos, complexity, curiosity and database systems. A place where research meets industry
Monday, 30 April 2018
Machine Learning Algorithm Cheat Sheet
Another machine learning cheat sheet to help you choose your algorithm. The cheat sheet is designed for beginner data scientists and analysts.
The types of learning.
Wednesday, 25 April 2018
GDPR
The General Data Protection Regulation (GDPR) comes into effect on 25 May 2018, one month from now. The EU General Data Protection Regulation is the most important change in data privacy regulation in 20 years. GDPR is fundamentally about protecting and enabling the privacy rights of the individual.
A Guide to enhancing privacy and addressing GDPR requirements with the Microsoft SQL platform is an interesting read. The obligations related to controls and security around the handling of personal data are some of the concepts discussed in the document.
GDPR Article 25—“Data protection by design and default”: Control exposure to personal
data.
• Control accessibility—who is accessing data and how.
• Minimize data being processed in terms of amount of data collected, extent of
processing, storage period, and accessibility.
• Include safeguards for control management integrated into processing.
GDPR Article 32—“Security of processing”: Security mechanisms to protect personal data.
• Employ pseudonymization and encryption.
• Restore availability and access in the event of an incident.
• Provide a process for regularly testing and assessing effectiveness of security
measures.
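To make the pseudonymization point a little more concrete, here is a minimal Python sketch of one common approach: replacing an identifier with a keyed hash, so records can still be joined on the token while the original value is not stored. This is only an illustration and is not taken from the Microsoft guide; the function name `pseudonymize`, the key value and the sample records are all assumptions made up for the example.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 digest that stands in for the original value."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key for the example; a real key would be stored apart from the data.
SECRET_KEY = b"example-key-stored-separately"

# Made-up sample records containing a direct identifier (email).
records = [
    {"email": "alice@example.com", "country": "UK"},
    {"email": "bob@example.com", "country": "DE"},
    {"email": "alice@example.com", "country": "UK"},
]

# Replace the identifier with its token; the same input always gives the same token.
pseudonymized = [
    {"email_token": pseudonymize(r["email"], SECRET_KEY), "country": r["country"]}
    for r in records
]

for row in pseudonymized:
    print(row["email_token"][:12], row["country"])
```

The same email always maps to the same token, so analysis by customer still works, but anyone holding the key can recompute tokens for known values. That is why the key needs to be kept separately from the data.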
GDPR Article 33—“Notification of a personal data breach to the supervisory authority”:
Detect and notify of breach in a timely manner (72 hours).
• Detect breaches.
• Assess impact on and identification of personal data records concerned.
• Describe measures to address breach.
GDPR Article 30—“Records of processing activities”: Log and monitor operations.
• Maintain an audit record of processing activities on personal data.
• Monitor access to processing systems.
GDPR Article 35—“Data protection impact assessment”: Document risks and security
measures.
• Describe processing operations, including their necessity and proportionality.
• Assess risks associated with processing.
• Apply measures to address risks and protect personal data, and demonstrate
compliance with the GDPR.
Friday, 20 April 2018
DataWorks Summit 2018
This was the first time I had attended the DataWorks Summit: Ideas. Insights. Innovation. for big data. I had the privilege of attending the Luminaries dinner on arrival at the conference. The dinner was held for the European Data Heroes award. The Hortonworks Data Heroes initiative recognizes the data visionaries, data scientists, and data architects transforming their businesses and organizations through Big Data.
Each day started with a set of keynotes.
Day 1 Opening Keynotes
The Single Most Important Formula for Business Success Scott Gnau - Hortonworks
Changing the Data Game with Open Metadata and Governance Mandy Chessell - IBM
Big Data Success In Practice: The Biggest Mistakes To Avoid Across The Top 5 Business Use Cases Bernard Marr - Bernard Marr & Co.
Munich Re: Driving a Big Data Transformation Andreas Kohlmaier - Munich Re
Scott Gnau opened his talk with a hypothesis: “Data is your cloud is your business”. Connecting disparate data to provide real-time information enables us to innovate fast. A data strategy is imperative; it needs to include governance and security, and adopt rapid change. Data drives our lives every day, from smart edge devices to all businesses.
He concluded that your data strategy is your cloud strategy is your business strategy: if (A) = (B) and (B) = (C) then (A) = (C).
Bernard Marr then shared his insights about AI automating more things faster and the fourth industrial revolution. He listed the top 5 business use cases as:
- Informing: to make better decisions
- Understanding: know your customers better
- Improvement: customer value proposition
- Automation: key business processes
- Monetization: data as an asset
A couple of interesting points raised were about specialist data hunting units to find new data sources and automation requirements to improve operations. Data diversity is key to improving analytics, along with data governance.
Day 2 Keynotes
Renault: A Data Lake Journey Kamelia Benchekroun - Renault Group
Are You Ready For GDPR? Jamie Engesser - Hortonworks, Srikanth Venkat - Hortonworks Inc
Embracing GDPR to Improve Your Business Practices in the Digital Age Enza Iannopollo - Forrester Research
Driving High Impact Business Outcomes from Artificial Intelligence Frank Saeuberlich – Teradata
On day 2, Forrester's Enza Iannopollo discussed embracing GDPR to improve your business practices in the digital age. Privacy by design and by default requires new business processes to be established and cultural change to happen. GDPR requires compliance across the organization and with external partners. Compliance strategies are only as good as your risk assessment and mitigation. The classification of data is a key place to start. The session concluded with a quote:
“Good data protection normally enables you to do more things with data, not less” Tim Gough, Head of Data Protection, Guardian News and Media
Saturday, 14 April 2018
SQL Information Protection with Data Discovery and Classification
The public preview of SQL Information Protection brings advanced capabilities built into Azure SQL Database for discovering, classifying, labeling, and protecting the sensitive data in your databases. SQL Data Discovery and Classification are also added to SQL Server Management Studio.
These tools will help meet data privacy standards and regulatory compliance requirements, such as GDPR. They will enable data-centric security scenarios, such as monitoring (auditing) and alerting on anomalous access to sensitive data, to be viewed in dashboards. They will also help with controlling access to and hardening the security of databases containing highly sensitive data.
SQL Information Protection (SQL IP) introduces a set of advanced services and new SQL capabilities, forming a new information protection paradigm in SQL aimed at protecting the data. The four areas covered are:
- Discovery and recommendations
- Labeling
- Monitoring/Auditing (Azure SQL Db only)
- Visibility
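To give a feel for what discovery and labeling involve, here is a small Python sketch that scans column names and recommends a label. It is not how SQL Information Protection works internally; the rules, information types, label names and column names below are all hypothetical and exist only to illustrate the idea of recommending labels from metadata.

```python
import re

# Hypothetical rules: (pattern on the column name, information type, sensitivity label).
RULES = [
    (re.compile(r"e[-_]?mail", re.I), "Contact Info", "Confidential"),
    (re.compile(r"phone|mobile", re.I), "Contact Info", "Confidential"),
    (re.compile(r"birth|dob", re.I), "Date Of Birth", "Confidential"),
    (re.compile(r"passport|national_?id", re.I), "National ID", "Highly Confidential"),
]


def recommend_labels(columns):
    """Return {column: (information type, label)} for columns matching a rule."""
    recommendations = {}
    for column in columns:
        for pattern, info_type, label in RULES:
            if pattern.search(column):
                recommendations[column] = (info_type, label)
                break  # the first matching rule wins
    return recommendations


# Made-up column names for a customer table.
columns = ["CustomerId", "EmailAddress", "DateOfBirth", "OrderTotal", "PassportNumber"]
recommendations = recommend_labels(columns)

for column, (info_type, label) in recommendations.items():
    print(f"{column}: {info_type} / {label}")
```

Name matching like this only produces recommendations; someone who knows the data still needs to review and accept each label.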
Ph.D Graduation
“A story has no beginning or end: arbitrarily one chooses that moment of experience from which to look back or from which to look ahead.”
― Graham Greene, The End of the Affair
After 7 years of hard work, bringing industry and research together, I was excited to attend my Ph.D graduation. What an awesome and humbling day. Words can't express how it felt as a Ph.D graduate, with a Doctor of Philosophy, to sit on the stage alongside the university academic staff. It is something I will never forget.
Now it is time to utilize the research skills I gained throughout the Ph.D and begin something new. My aspirations in the academic field are to write many papers, share my research findings and become a research fellow.
Thursday, 12 April 2018
Microsoft Professional Program for Artificial Intelligence
With Artificial Intelligence (AI) defining the next generation, this Microsoft course seems a great way to jump-start your skills.
The course covers these modules:
- Introduction to AI
- Use Python to Work with Data
- Use Math and Statistics Techniques
- Consider Ethics for AI
- Plan and Conduct a Data Study
- Build Machine Learning Models
- Build Deep Learning Models
- Build Reinforcement Learning Models
- Develop Applied AI Solutions
- Final Project
At the end you gain the Microsoft Professional Program Certificate in Artificial Intelligence.
Tuesday, 10 April 2018
Leverage data for building
The 'Leverage data to build intelligent apps' presentation gives an insightful overview of the Microsoft Data Platform and how to innovate with analytics and AI.
Monday, 9 April 2018
Advice and guidance on becoming a speaker or volunteer
I watched this great session giving 'advice and guidance on becoming a speaker or volunteer' from SQLBits this year.
I felt humbled when I listened to the SQLBits session recording as I am named as an absolute legend for attending all 16 SQLBits and helping for over 8 years. I had never spoken, never presented or been involved in the public facing side of the conference. It is such a great feeling helping the conference be successful, helping others enjoy what working with data brings and being a part of the sqlfamily. Thanks to SQLBits for enabling me to be a part of such an amazing event for all of these years.
The PhD Bookshelf
Following on from the creation of a literature map for my PhD, I started to formulate a plan of literature to read. These are some of the books on my bookshelf.
I also read many academic papers, stored in seven box files and in Mendeley.
Mendeley is a free reference manager. It enables you to manage your research, showcase your work, connect and collaborate with over six million researchers worldwide.
I found the Communications of the ACM journal and SIGMOD, the ACM Special Interest Group on Management of Data journal, great reads.
Friday, 6 April 2018
Demystify complex relationships with SQL Server 2017 and graph
This great infographic shows some quick tips about SQL Server 2017 and graph databases. You can view this at: http://msft.social/jCIE18 . The picture demonstrates nodes and edges and provides a clear example of the code changes between the Traditional SQL query and the Graph query.
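For readers who cannot view the infographic, here is a rough Python sketch of the same contrast. It is not the infographic's own example; the `Person` and `FriendOf` tables and the sample names are made up. The traditional join query runs against an in-memory SQLite database, while the SQL Server 2017 graph form is shown only as a string, since SQLite has no graph MATCH syntax and the query assumes `Person` was created AS NODE and `FriendOf` AS EDGE.

```python
import sqlite3

# In-memory database with made-up people and friendships.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE FriendOf (from_id INTEGER, to_id INTEGER);
    INSERT INTO Person VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO FriendOf VALUES (1, 2), (2, 3);
""")

# Traditional relational form: the relationship is expressed through two joins.
TRADITIONAL_QUERY = """
    SELECT p2.name
    FROM Person AS p1
    JOIN FriendOf AS f ON f.from_id = p1.id
    JOIN Person AS p2 ON p2.id = f.to_id
    WHERE p1.name = ?
    ORDER BY p2.name
"""
friends = [row[0] for row in conn.execute(TRADITIONAL_QUERY, ("Alice",))]
print(friends)

# SQL Server 2017 graph form, shown for comparison only and not executed here.
# Assumes Person is a NODE table and FriendOf is an EDGE table.
GRAPH_QUERY = """
    SELECT p2.name
    FROM Person p1, FriendOf, Person p2
    WHERE MATCH(p1-(FriendOf)->p2)
      AND p1.name = 'Alice'
"""
```

In the graph form the MATCH pattern describes the path directly, which becomes easier to read than chained joins as the number of hops grows.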
Tuesday, 3 April 2018
Cosmos DB SQL query cheat sheet
The new Azure Cosmos DB: SQL Query Cheat Sheet helps you write queries for SQL API data by displaying common database queries, keywords, built-in functions, and operators in an easy-to-print PDF reference sheet. Reference information for the MongoDB API, Table API, and Gremlin/Graph API is also included.
Sunday, 1 April 2018
Literature Map
When you start any research project, you need to set the research in the context of the current literature. This will establish a framework for the importance of the study. This document was the starting place for organizing the literature of interest in my research.
Thesis
Title: A Study in Best Practices and Procedures for the Management of Database Systems