Welcome

Passionately curious about Data, Databases and Systems Complexity. Data is ubiquitous, the database universe is dichotomous (structured and unstructured), expanding and complex. Find my Database Research at SQLToolkit.co.uk. Microsoft Data Platform MVP.

"The important thing is not to stop questioning. Curiosity has its own reason for existing" Einstein



Showing posts with label Ethics.

Sunday, 23 May 2021

Ethics Self-Assessment Tool

The ethics self-assessment tool helps researchers use an ethics framework throughout their research.

This tool helps shape discussions and highlight ethical issues. The question it prompts researchers to ask is what should be done. Biases in AI research can cause harm or disproportionately weight outputs. Potential biases can come from the data sources, the methods employed, the outputs and how the results are interpreted. The framework is here.

Microsoft is investing in helping people understand ethics in the business and research arena. The Microsoft ethical rules are based on 6 principles. To get started with that holistic approach to AI and learning, go to the AI Business School for Artificial Intelligence.


The principles of responsible AI from Microsoft are:

  • Fairness - AI systems should treat all people fairly
  • Reliability & Safety - AI systems should perform reliably and safely
  • Privacy & Security - AI systems should be secure and respect privacy
  • Inclusiveness - AI systems should empower everyone and engage people
  • Transparency - AI systems should be understandable
  • Accountability - People should be accountable for AI systems


    Tuesday, 7 January 2020

    ODI Data Ethics Canvas

    The ODI has created a useful tool for anyone who collects, shares or uses data. It aims to help identify and manage ethical issues at the start of a data project. The Open Data Institute defines data ethics as:

    'A branch of ethics that evaluates data practices with the potential to adversely impact on people and society – in data collection, sharing and use'

    They say data ethics relates to good practice around how data is collected, used and shared. It is especially relevant when data activities have the potential to impact people and society, directly or indirectly.

    The Data Ethics Canvas is a part of a wider data toolkit.


    Thursday, 2 January 2020

    Asilomar AI Principles

    Data ethics has been brought to the fore by AI algorithms showing bias. There are various insightful articles which discuss data ethics. The Asilomar Conference on Beneficial AI, organized by the Future of Life Institute, was held January 5-8 2017 at the Asilomar Conference Grounds in California. The conference aimed to address and formulate principles of beneficial AI. With more than 100 thought leaders and researchers in economics, law, ethics and philosophy at the conference, it resulted in the creation of a set of guidelines for AI research. There are 23 Asilomar AI Principles, many of which are related to ethics and values.

    This is a significant enhancement on Isaac Asimov's "Three Laws of Robotics", which were shared in his 1942 short story "Runaround". The Three Laws he listed were:

    • A robot may not injure a human being or, through inaction, allow a human being to come to harm.
    • A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.
    • A robot must protect its own existence as long as such protection does not conflict with the First or Second Laws.

    In 2016 Satya Nadella shared a vision for more relevant AI rules:

    • AI must be designed to assist humanity.
    • AI must be transparent. 
    • AI must maximize efficiencies without destroying the dignity of people. 
    • AI must be designed for intelligent privacy. 
    • AI must have algorithmic accountability. 
    • AI must guard against bias. 

    This has led to data ethics becoming its own branch of ethics.

    Tuesday, 5 March 2019

    International Women's Day

    International Women's Day is fast approaching. It is celebrated on 8 March every year. It is a celebration of women globally. It is a chance to network, be inspired and share your stories to empower each other.
    The official UN theme for 2019 is Think Equal, Build Smart, Innovate for Change. The theme will focus on innovative ways to advance gender equality and the empowerment of women, particularly in the areas of social protection systems, access to public services and sustainable infrastructure.

    I am excited to be contributing to a webinar for International Women's Day 2019. The webinar discusses women's roles in the field of technology.


    Saturday, 2 March 2019

    SQLBits 2019 Keynote

    What an amazing SQLBits in Manchester. Four days packed full of leading-edge data technology, covering:

    • SQL Server 2019 Big Data
    • Azure SQL Database Managed Instance
    • Power BI
    • Kubernetes
    • Machine Learning
    • Python
    • Spark

    This year SQLBits 2019 had a keynote. It was nice for the event to have a keynote again. The theme was Data Never Rests. The speakers from the Microsoft Data Platform product group were Buck Woody, Bob Ward, Anna Thomas, Alain Dormehl, Adam Saxton and Patrick LeBlanc. An amazing set of speakers and a fun keynote. They shared details of the evolution of the data platform to enable people to keep their skills up to date. The keynote is available to watch. There were several major announcements.


    SQL Server 2019 will RTM in the second half of the year. SQL Server 2019 CTP 2.3 is available now with:
    • Big data cluster enhancements
    • Accelerated database recovery
    • Performance enhancements
    • Graph data enhancements
    • SSAS enhancements

    SQL Server 2019 is a modern innovation and there are various forms of the product.
    • On Premises
    • SQL Server Azure VM (IaaS)
    • Azure SQL DB Managed Instance (PaaS)
    • Azure SQL Data Warehouse


    Azure SQL Database Hyperscale can autoscale up to 100TB and scale compute and storage independently.
    During the keynote they showed Azure SQL Database Hyperscale where a 50TB database was restored in just under 8 minutes. That is nice accelerated database recovery.

    Data virtualization and big data clusters are a game-changing view of SQL Server 2019, bringing together data lake scale, machine learning and AI. Multiple data sources can be connected using external tables, through the compute pool, using PolyBase connectors at the source. Data persisted from multiple data sources is stored in shards of the data pool, which serves as a data mart for SQL Server 2019 big data clusters.

    SQL Server 2019 will push query predicates down to other data platforms via PolyBase, so that SQL data can be joined with Oracle, MongoDB and Cosmos DB data in one place efficiently.

    SQL notebooks in Azure Data Studio are an awesome new feature.

    There is documentation to read and new courses for learning.

    aka.ms/DataAccessGuide

    and a Summary of All Exams and Certifications Launched in January, 2019!

    aka.ms/DataEngCerts







    Monday, 11 February 2019

    Data Trends for 2019


    I created a survey question on Twitter to look at data trends. I was interested to see whether people felt that improving the quality of their data was more important than AI data ethics. Data quality is heavily influenced by data ingest, so I added this as an option, as I felt it is often overlooked but is a foundation stone of good data quality.

    A few definitions:

    Data Ethics "describe a code of behaviour, specifically what is right and wrong, encompassing the following: Data Handling: generation, recording, curation, processing, dissemination, sharing, and use."

    Data Quality (DQ), as stated in the DAMA International Data Management Body of Knowledge, "refers to both the characteristics associated with and to the processes used to measure or improve the quality of data." "Data is considered high quality to the degree it is fit for the purposes data consumers want to apply it."

    "Data ingestion is the process of obtaining and importing data for immediate use or storage in a database. To ingest something is to 'take something in or absorb something.' Data can be streamed in real time or ingested in batches."

    "Data ingestion tools provide a framework that allows companies to collect, import, load, transfer, integrate, and process data from a wide range of data sources."
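
To make the link between ingest and data quality concrete, here is a minimal sketch of batch ingestion with a quality gate. It is only an illustration: the `readings` table, the `ingest_in_batches` function and the sample rows are hypothetical, and Python's built-in `sqlite3` stands in for whatever database is actually in use.

```python
import sqlite3


def ingest_in_batches(rows, conn, batch_size=2):
    """Load (id, value) rows into a table in fixed-size batches.

    Rows with a missing value are rejected at ingest time instead of
    being stored. Returns a (loaded, rejected) tuple of counts.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings (id INTEGER PRIMARY KEY, value REAL)"
    )
    loaded = rejected = 0
    batch = []
    for row in rows:
        if row[1] is None:  # quality gate: refuse incomplete rows
            rejected += 1
            continue
        batch.append(row)
        if len(batch) == batch_size:  # flush a full batch
            conn.executemany("INSERT INTO readings VALUES (?, ?)", batch)
            conn.commit()
            loaded += len(batch)
            batch = []
    if batch:  # flush the final partial batch
        conn.executemany("INSERT INTO readings VALUES (?, ?)", batch)
        conn.commit()
        loaded += len(batch)
    return loaded, rejected


# Hypothetical sample data: one row is missing its value.
conn = sqlite3.connect(":memory:")
source = [(1, 20.5), (2, None), (3, 19.0), (4, 21.2)]
result = ingest_in_batches(source, conn)
print(result)  # (3, 1)
```

Rejecting the incomplete row at the point of ingest, rather than cleaning it up later, is one way to treat ingestion as the foundation of data quality.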

    The survey question had 267 votes.

    What do you think will be the most important #Data trend for 2019 out of the following options?

    In addition to the poll results, I received a few comments.
    • Neither
    • The biggest thing in my opinion is just ethics. How is the data collected?
    • Also, what is it being used for. What are the impacts of high or low accuracy models.
    • Improving quality and ethics seem to me, to be related tasks
    • All of the above?


    The results are quite interesting with AI Data Ethics and Improving Data Quality being the trends that the respondents thought were the most important.

    Monday, 11 July 2011

    Database Ethics

    This article provides a reflection on the ethics required by a database administrator (DBA). This philosophical code of behaviour sets out the principles for the profession. The DBA ethics storyboard is a visualization of the ethics and is shown in Figure 1.


    Figure 1 DBA Ethics Storyboard

    Ethic 1 DBA - the role of a DBA requires
    • being a database advocate
    • being proactive rather than reactive
    • handling numerous interconnected responsibilities
    • communicating clearly with all the interconnected technologists
    • making recommendations to the public
    • taking corrective action immediately if a mistake is made
    • exercising governance and agility

    Ethic 2 DATA
    • the data an organization holds is the most precious asset it has. A DBA must protect it
    • the quality of the data should be maintained
    • all types of data, whether structured or unstructured, need looking after
    • data should be retained and archived appropriately

    Ethic 3 SECURITY - a DBA is responsible for
    • making data available to authorized users and ensuring it is inaccessible to unauthorized users
    • identification of sensitive data, managing it securely and auditing access
    • patching, guided by a patching philosophy, to prevent security issues arising

    Ethic 4 ARCHITECTURE - a DBA should
    • ensure good data modelling and design techniques are used
    • use the right database application for the requirements
    • have an entity relationship diagram for each database
    • have a master data management process to define and manage entities

    Ethic 5 DEVELOPMENT - a DBA should ensure
    • the development scripts written are version controlled
    • coding standards are followed
    • the scripts include a description of the tasks

    Ethic 6 AVAILABILITY - a DBA should
    • monitor the system to ensure it is always available when it is needed
    • regularly review error logs
    • manage scheduling and job success rate
    • ensure there is a capacity management process in place
    • ensure good performance is maintained

    Ethic 7 DISASTER RECOVERY - a DBA should ensure
    • backups are taken regularly and their quality is regularly verified
    • every database has a documented disaster recovery plan
    • a recovery time objective and a recovery point objective are defined
    • backups are stored offsite
    • a server configuration snapshot is stored offsite
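
As a rough illustration of what defining a recovery point objective (RPO) and a recovery time objective (RTO) makes possible, the sketch below checks a backup schedule against them. The function names, timestamps and thresholds are hypothetical examples, not values recommended by this article.

```python
from datetime import datetime, timedelta


def meets_rpo(last_backup, now, rpo):
    """True when a failure at `now` would lose no more data than the
    recovery point objective allows."""
    return now - last_backup <= rpo


def meets_rto(estimated_restore, rto):
    """True when the estimated restore duration fits within the
    recovery time objective."""
    return estimated_restore <= rto


# Hypothetical example: the last backup finished 2.5 hours ago.
now = datetime(2011, 7, 11, 12, 0)
last_backup = datetime(2011, 7, 11, 9, 30)

rpo_4h = meets_rpo(last_backup, now, timedelta(hours=4))
rpo_1h = meets_rpo(last_backup, now, timedelta(hours=1))
rto_ok = meets_rto(timedelta(minutes=45), timedelta(hours=1))

print(rpo_4h)  # True
print(rpo_1h)  # False
print(rto_ok)  # True
```

A check like this only has meaning once both objectives are written down, which is why the list above asks for them to be defined.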

    Ethic 8 CHANGE - a DBA should
    • monitor changes to the system
    • follow change management processes
    • have a multi-tier environment to validate changes
    • have a risk assessment strategy
    • ensure rollback can occur for failed changes
    • identify the consequences of a change
    • have a verification process to determine if the change was successful

    Ethic 9 PROBLEMS - a DBA should
    • have a process for problem management

    Ethic 10 DOCUMENT - a DBA should ensure
    • appropriate documentation is written or obtained
    • a summary of essential information, such as configuration information, is created
    • there is a process for keeping the documentation up to date and reviewed regularly
    • a self-documenting system is created where possible

    Ethic 11 AUDITING - a DBA should ensure there are regular database audits which
    • periodically audit each database and provide a health check
    • have a predefined checklist to follow
    • have a description of the purpose of each check
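
One possible shape for such a checklist is sketched below: each check carries its name, a description of its purpose and a test, and the audit runs them all to produce a simple health report. The check names, thresholds and the `db` dictionary of metrics are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Check:
    name: str
    purpose: str  # why the check exists, kept next to the check itself
    passed: Callable[[Dict], bool]


# Hypothetical predefined checklist; thresholds are examples only.
CHECKLIST: List[Check] = [
    Check(
        "backup_age",
        "Confirm a recent backup exists",
        lambda db: db["hours_since_backup"] <= 24,
    ),
    Check(
        "free_space",
        "Confirm the data files have room to grow",
        lambda db: db["free_space_pct"] >= 15,
    ),
]


def health_check(db: Dict) -> Dict[str, bool]:
    """Run every check in the checklist against one database's metrics."""
    return {check.name: check.passed(db) for check in CHECKLIST}


report = health_check({"hours_since_backup": 30, "free_space_pct": 40})
print(report)  # {'backup_age': False, 'free_space': True}
```

Storing the purpose alongside each check keeps the checklist self-documenting, in the spirit of Ethic 10.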

    Ethic 12 BEST PRACTICE - a DBA should
    • follow best practice for design, development and administration based on vendor recommendations, practical field experience, database user groups and environmental constraints
    • aim for standardization and automation

    Ethic 13 IMPROVEMENT - a DBA should
    • reflect and iterate throughout the database system
    • have a documented database roadmap

    Thursday, 30 April 2009

    DBA Code of Ethics

    Databases are now used across multiple industries and environments, so there is a need to look not only at the ethics of the data contained within the databases but also at the group of people who administer and are the guardians of the database. There is ethical concern over the information contained within the database: how it is stored, accessed, secured and gathered. Regardless of this, the administrators need to ensure that they follow some ethical principles or guidelines. The guidelines should cover not only the overarching ethics but also the core aspects which need to be covered whilst administering the database.

    Ethic 1 - The DBA role consists of
    • Being a champion of the database
    • Numerous stated and unstated responsibilities
    • Explaining the DBA role and recommendations to the public
    • The DBA hero is one who avoids any problems rather than a firefighter of issues
    Ethic 2 - The Company’s data is the most precious asset it has. A DBA must protect it.
    Ethic 3 - A DBA is responsible for making data available to authorized users and ensuring data is inaccessible to unauthorized users. Identifying sensitive data, managing it securely and auditing access is also a key responsibility.
    Ethic 4 - A DBA should have a patching philosophy for when
    • Security issues arise
    • Software bugs and enhancements arise
    Ethic 5 - A DBA should monitor the system to ensure it is always available when it is needed
    Ethic 6 - A DBA should ensure backups are taken regularly and verify the quality of the said backup.
    Ethic 7 - A DBA should ensure every database has a documented disaster recovery point specified
    • What is the time to recover?
    • What is the recovery point interval?
    Ethic 8 - A DBA should monitor changes to the system
    • Follow Information Technology Infrastructure Library (ITIL) best practice for IT Service Management
    • Have multi-tier environments to validate changes
    • Ensure rollback can occur for failed changes
    • Ensure development scripts are written and version controlled to change the environments
    • Identify the consequences of each change
    Ethic 9 - A DBA should ensure appropriate documentation is written or obtained
    • A summary of essential information such as configuration
    • A process for keeping it up to date, which is the crucial aspect
    • Self-documenting systems where possible
    Ethic 10 - A DBA should ensure good Data Modelling is applied so the data is useable across the systems
    Ethic 11 - Have an interactive process for problem management
    Ethic 12 - Ensure there is a capacity management process in place
    Ethic 13 - A DBA should ensure there are regular database Audits which
    • Periodically audit each database
    • Have a Checklist for problems
    • Have a sheet describing the purpose of each check
    Ethic 14 - A DBA should design and follow best practice for design, development and administration
    Ethic 15 - When you've made a mistake, admit it quickly so corrective action can be taken immediately.