Journey To Cloud

Introduction to IBM Cloud Hyper Protect Services

Overview – The global average cost of a data breach is $3.92 million, and the figure rises significantly in industries with regulatory concerns; in healthcare, for example, a breach can cost up to $6.45 million. The loss of customer trust had serious financial consequences for the companies studied, and lost business was the largest of the four major cost categories contributing to the total cost of a data breach. Since 2014, the share of breaches caused by malicious attacks has grown by 21 percent, from 42 percent of breaches in 2014 to 51 percent of breaches in 2019.

Protect highly sensitive, confidential data and workloads on IBM Cloud

Problem – Data protection regulations either mandate encryption of data at rest and in transit or strongly encourage encryption as a technical measure to protect data.

Solution – Protect highly sensitive, confidential data and workloads on IBM Cloud with IBM Cloud Hyper Protect Services

S.no. | Threat Vector | Solution
1 | Remote Attack | Virtual Servers have strict control over SSH access
2 | Privilege Escalation | Peer Isolation prevents breakout exploits
3 | Insider Attack | Restricted memory dump access with no client data being extracted; all data-at-rest is encrypted
4 | Image Tampering | Only signed images are deployed via Secure Build

Threat vectors and their solutions, illustrated in a table.

FAQs

1. Do you need to be a Database Administrator (DBA) with specialized skills to manage IBM Hyper Protect DBaaS?
– False

2. What is the name of the technology that provides a lot of the security capabilities for IBM Cloud Hyper Protect Services?
– Secure Service Container

3. IBM Cloud Hyper Protect Crypto Services is a ______________________________.
– Key Management and Cloud Hardware Security Module (HSM) Service

4. Why is it recommended that a Systems Seller partner with a Cloud Seller to complete the Software Quote and Order (SQO) Tool order process for a Cloud Subscription?
– All of the above

5. Why can’t insider threats compromise sensitive data?
– Because all data at rest is encrypted

6. With IBM Cloud Hyper Protect Virtual Servers, how can I ensure that I have complete authority over my created Virtual Server instance?
– The offering requires that a user provides their own public SSH key, meaning that IBM has no access

7. What are the 4 Offerings within Hyper Protect Services?
– DBaaS, Crypto Services, Virtual Servers, Containers

8. Operational Assurance means that IBM cannot access your data.
– False

9. How does IBM Hyper Protect Virtual Servers provide security without sacrificing on performance?
– Each virtual server instance is deployed as a highly available cluster with Multi Zone Region (MZR) support

10. Hyper Protect Services are primarily designed to help you protect which of the following?
– Mission Critical Workloads with Sensitive Data

11. What is the most common way for Systems and Cloud Sellers to sell Hyper Protect Services to their clients?
– An IBM Cloud Subscription through SQD DSW

12. Hyper Protect Crypto Services is a multi-tenant service.
– False

13. What is the name of the capability that prevents malware from being injected into production?
– Secure Build

14. With IBM Hyper Protect DBaaS, how can I ensure that no-one has access to my data?
– Because the service is built on the security and resilience of LinuxONE, there is built-in tamper protection. Data is isolated and encrypted at rest and in flight. No one, not even an IBM administrator, can access your data

15. Hyper Protect Services offers unmatched security on the IBM LinuxONE with Technical Assurance that no one can see your data.
– True

16. What does research suggest is the leading inhibitor for Public Cloud adoption?
– Security, privacy, and regulatory concerns

17. Hyper Protect DBaaS provides industry-leading confidentiality and security without sacrificing performance. Choose the best two answers to support this statement.
– You can expect 99.99% SLA with Hyper Protect DBaaS
– Hyper Protect DBaaS is provisioned in 3 zones (1 primary and 2 secondary) providing you the highest level of security in the industry without sacrifice to performance

18. How many attack vectors are covered in “How to Safeguard with Hyper Protect Services” Session?
– 4

19. What are the types of databases supported by Hyper Protect DBaaS? Choose the 2 best Answers.
– PostgreSQL, MongoDB EE

20. What is the top inhibitor for enterprises to move workloads to the cloud?
– Sensitive data being uploaded to the cloud

21. What are the most common impacts of a breach?
– All of the above

22. What is it called when an external threat tries to hack into a Virtual Server?
– Remote Attack

23. IBM Cloud Hyper Protect Virtual Servers prevents external attacks, but there is a chance that internal attacks could occur and your data can be compromised.
– False

24. Hyper Protect Crypto Services offers the highest level of security in the industry and is ______________.
– Built on FIPS 140-2 Level 4 certified hardware

IBM Storage Cyber Resiliency and Modern Data Protection Level 1

In today’s world, data protection needs to involve more than the traditional concepts of backup, disaster recovery, and business continuity. Instead, data protection needs to also include more modern concepts such as data reuse, security, air-gap, multicloud, and cyber-resilience. Let us explore these new areas to see how they can help clients to maximize business uptime, lower storage and operational costs, and support cyber resiliency.

There are a variety of reasons clients care about protecting their environment. The five most common are:

Hardware failure of networks, power, servers, or storage. Systems don’t last forever. In a data center, these kinds of interruptions are very commonplace.

Invalid or incorrect data, or more precisely, corrupted data. Errors can easily occur during the reading, writing, storage, transportation, or processing of data.

Human error. Just like systems, humans aren’t perfect either. As such, data can be lost or made insecure purely by accident.

Cyber attacks. These are now part of the “new” normal. Nearly every week, we hear about another high-profile company that has been adversely affected by a new cyber attack.

Natural disasters. To see why this is an issue, simply imagine what would happen if a hurricane or earthquake destroyed the data center where your bank balance information was held, including backups too.

Commonly used terminology

Air gap
Physical isolation of systems or networks to avoid widespread corruption of data due to malware infection, system failures, or human error.

Backup
A copy of computer data taken and stored elsewhere so that it may be used to restore the original after a data loss event.

Data breach
An incident that exposes confidential or protected information.

Encryption
An algorithmic technique that takes a file and changes its contents into something unreadable to those outside the chain of communication.

Intrusion detection system
A type of security software designed to automatically alert administrators when someone or something is trying to compromise an information system through malicious activities or through security policy violations.

Malware
An umbrella term that describes all forms of malicious software designed to cause havoc on a computer. Typical forms include viruses, trojans, worms, and ransomware.

Ransomware
A form of malware that deliberately prevents you from accessing files on your computer. If a computer is infected by malware designed for this purpose, it will typically encrypt files and request that a ransom be paid in order to have them decrypted.

Recovery Point Objective (RPO)
The amount of time between data protection events. This reflects the amount of data that could potentially be lost during a disaster recovery.

Recovery Time Objective (RTO)
The targeted duration of time and a service level within which a business process must be restored after a disaster (or disruption) in order to avoid unacceptable consequences associated with a break in business continuity.

Vulnerability
A weakness that can be exploited by a threat actor, such as an attacker, to perform unauthorized actions within a computer system.

Write-once read-many (WORM)
A characteristic of the media where written user data cannot be modified or erased. Destruction of the data requires destruction of the physical media and its usability (for instance via intense heat, crushing, physical shredding, or intense magnetic fields)
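
As a rough illustration of the RPO and RTO definitions above, the following Python sketch checks a backup schedule against recovery objectives. The function names and the sample figures are hypothetical, chosen only for illustration; they do not describe any IBM offering.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    """Worst case: the failure happens just before the next backup runs."""
    return backup_interval


def meets_objectives(backup_interval, restore_duration, rpo, rto):
    """Return (rpo_ok, rto_ok) for a hypothetical backup schedule."""
    rpo_ok = worst_case_data_loss(backup_interval) <= rpo
    rto_ok = restore_duration <= rto
    return rpo_ok, rto_ok


# Hypothetical figures: nightly backups, 3-hour restore, 4-hour RPO, 8-hour RTO.
result = meets_objectives(
    backup_interval=timedelta(hours=24),
    restore_duration=timedelta(hours=3),
    rpo=timedelta(hours=4),
    rto=timedelta(hours=8),
)
print(result)  # (False, True)
```

In this made-up case the restore is fast enough to satisfy the RTO, but nightly backups could lose up to 24 hours of data, so the 4-hour RPO is missed and more frequent data protection events would be needed.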

FAQs

1. What is NOT a capability of IBM Tape?
– Fastest access to the first relevant byte of data, compared to disk and flash

2. Which of the following vendors is considered more of an incumbent than a newcomer?
– Commvault

3. Which offering is most often used for traditional backups?
– IBM Spectrum Protect

4. Which of the following is NOT an appropriate response to IBM’s position on the most recent Forrester Wave for Data Resiliency Solutions?
– IBM is proud to be the data resiliency leader on the Forrester Wave.

5. What was the main benefit stated in the Spectrum Protect Plus reference by the State of Ohio’s Department of Administrative Services?
– The client was able to restore a 2.5TB VM in less than three minutes

6. What are the two types of tape format solutions that IBM offers?
– IBM offers Linear Tape-Open (LTO) and Enterprise tape solutions

7. What are the four key aspects of IBM’s modern data protection platform vision?
– Scale up/out, cyber resiliency, Docker/Kubernetes support, AI-infused protection

8. Which of the following is the most competitive backup product in the virtual server environment?
– Veeam

9. Which of the following is NOT an evolutionary trend of data protection?
– Moving all tape to cloud

10. On average, how long does it take to discover a data breach has occurred?
– 206 days, or about 7 months

11. What is a Recovery Point Objective (RPO)?
– The maximum allowable amount of data that is not backed up and is thus acceptable to lose due to a disaster

12. Which of the following is most commonly encountered as a competitor to IBM Spectrum Protect?
– Veritas

13. What technology provides isolation of networks to avoid widespread corruption of data due to malware infection?
– Air Gap

14. Which offering catalogs files and other IT objects to make them easily searchable, while also providing the ability to distinguish where data copies are located in a client’s environment?
– IBM Spectrum Copy Data Management

15. What does WORM stand for?
– Write Once Read Many

16. Which of the following is NOT a typical licensing structure that is used for modern data protection?
– Per backup

17. When speaking to a prospective client, which of the following is NOT an appropriate goal on a first call?
– Prove that we have the best offering specifications

18. What are the three primary IBM offerings for modern data protection?
– IBM Spectrum Protect, IBM Spectrum Protect Plus, and IBM Spectrum Copy Data Management

19. Which of the following is NOT a factor that drives the need for data protection?
– Improved rack cooling

20. Which offering is best positioned to protect virtual, cloud and containerized environments?
– IBM Spectrum Protect Plus

21. Which software provides a Recovery Time Objective (RTO) within minutes for restoring snapshots?
– Spectrum Protect Plus

22. Which is NOT a core business value of IBM’s Data Protection portfolio?
– Data refreshing

23. What is a Recovery Time Objective (RTO)?
– Targeted duration of time within which a business process must be restored after a disaster in order to avoid an unacceptable break in business continuity

24. Which of the following acronyms is a characteristic of media where written data cannot be modified or erased, except by physically destroying the media through extreme means?
– WORM

SAS on IBM Power Systems

SAS Institute is a private company with more than 83,000 SAS customer sites in 147 countries

There is a strong customer adoption of SAS Viya as customers modernize their analytics environments

SAS is investing a total of $1 billion in AI over three years

IBM and SAS have been partners since the founding of SAS. IBM Power Systems has been a critical part of that partnership since the 90s and it continues with the latest generation of Power Systems – IBM POWER9. IBM POWER9 based systems are designed from the ground up to address the key needs of SAS workloads – massive I/O and memory bandwidth, accelerators for ML/DL/AI training and inference, and flexibility to run SAS workloads on the right cloud platforms.

SAS is positioning its next-generation in-memory analytics framework, SAS Viya, as its “Evolve Your Analytics Platform”. It is an open, cloud-enabled ML/DL/AI workload. There are a number of products that take advantage of the SAS Viya framework. The most frequently used products are Visual Analytics, Visual Statistics, and Visual Data Mining and Machine Learning. In addition to supporting SAS’ proprietary SAS coding language, Viya users can also take advantage of open-source languages including Python, R, and Lua. Viya is mainly a Linux-based workload and is specifically a Red Hat workload on IBM Power Systems.

Key Industries: Banking, manufacturing, retail, life sciences, insurance and education

FAQs

SAS Institute is ranked by analysts as a leader in:
AI-Based text analytics, Machine Learning, Fraud Detection, Analytics, Customer Analytics, Enterprise Insights Platform, Risk Management, Big Data, Data Integration, and Data Quality

1. What is a fact about the technology relationship between IBM and SAS Institute?
– IBM and SAS have more than 40 years of technology relationship and IBM is SAS’s longest-standing alliance partner.

2. What is a key characteristic of SAS Viya, SAS’ next-generation analytics framework?
– It accelerates analytics with Machine Learning/Deep Learning and Artificial Intelligence (AI).

3. SAS Viya is an analytics framework that includes machine learning, deep learning, and artificial intelligence workloads. Which key products are available to run in SAS Viya?
– Visual Analytics, Visual Statistics, and Visual Data Mining and Machine Learning

4. IBM’s message with SAS Viya is to sell an agile, full-stack solution.  What products make up a robust solution for a SAS Viya client?
– IBM POWER9 servers running Linux with IBM Storage.

5. How do IBM systems with POWER9 processors empower Advanced Analytics?
– IBM Power Systems provide full-stack infrastructure solutions to empower SAS analytics with industry-leading performance, flexible deployment options and maximum resilience.

6. Which of the following operating systems does SAS Viya run on?
– Linux and Windows

7. What are key challenges of using SAS on commodity hardware, that can result in longer SAS jobs, inefficient SAS processes and increased risk to the business?
– Underpowered servers, I/O bottlenecks, unplanned downtime, and security vulnerabilities.

8. How does SAS Viya relate to and interact with traditional SAS 9.4?
– SAS Viya can coexist with traditional SAS 9.4. SAS Viya can be seen as a workload to augment traditional SAS 9.4 and Grid deployments.

9. Which IBM Storage family for SAS deployments delivers 99.9999% uptime and is able to take advantage of HyperSwap to obtain even higher uptime?
– IBM FlashSystem, regardless of version.

10. Which IBM Storage products for SAS deployments feature NVMe performance?
– All IBM FlashSystem models

11. A client wants to deploy SAS Visual Data Mining and Machine Learning (VDMML). Which IBM server has unique capabilities suitable for this workload?
– The IBM Power Systems AC922, because it supports Nvidia GPUs and has NVLink 2, which accelerates GPU-to-CPU data transfer. This is the only system with this capability and the value will be reflected in the response times in VDMML.

12. A test comparing SAS 9.4 on IBM Power Systems with POWER9 and IBM all-flash FlashSystem storage with IBM Power Systems on non-flash storage revealed that:
– All key performance indicators were significantly better. In particular, both read and write throughput improved, resulting in better (that is, lower) SAS real time, which was nearly cut in half.

13. What is a key change in SAS Viya compared to traditional SAS?
– New modern coding languages are supported: Python, R, Java, and Lua.

14. How does IBM Power Systems help reduce risk for mission-critical SAS deployments?
– IBM Power Systems offer the highest server reliability, proven high performance, and flexibility through advanced virtualization and capacity on demand. Furthermore, IBM can provide the full-stack infrastructure for SAS (Compute, Storage, Network).

15. What is a good way to conclusively assess the reliability of IBM Power Systems servers compared to its competition?
– Peruse the most recent annual ITIC Global Server Hardware, Server OS Reliability Survey.

16. When on IBM Power Systems, which operating system would SAS Viya be deployed on?
– Linux

17. Centered on CAS (Cloud Analytic Services), Viya speeds the time it takes to garner valuable insights from data. Which of the following is NOT an advantage of the CAS architecture?
– All actions are inherently parallel and use a single node.

18. Which of the deployment options support KVM and CPUs?
– Single-server on Accelerated server and Multi-servers on Accelerated servers.

19. Which of the following IBM infrastructure components would NOT be a good choice for a SAS 9.4 analytics infrastructure solution?
– IBM Power Systems with POWER9 processors

20. SAS workloads bring data from numerous sources and clients must deploy hardware that is able to keep up with the demands of the workload. Much of the work that a SAS programmer is performing is ETL (extract, transform, load) with some amount of advanced analytics. What are root challenges with commodity hardware?
– I/O bottlenecks, Underpowered SAS servers, Security vulnerabilities, Unplanned Downtime

21. What are the typical deployment modes for SAS Viya?
– SMP modes and MPP modes (Mixed Environment or not). Each mode has scale-out and scale-up options

22. Which models of POWER9 servers can be used to deploy SAS Viya?
– SAS Viya is optimized across the entire Power System based portfolio and can co-locate with SAS 9.4.

23. Who is tasked to make recommendations for hardware sizing for SAS workloads?
– The SAS EEC (Enterprise Excellence Center) team

24. How are SAS licenses acquired by their end-user?
– SAS on Power is a ‘bring your own license’ (BYOL) model, so the end-user procures licenses directly from a SAS authorized channel.

IBM Z Software Foundation

IBM Z Software Foundation – Key Links

Lesson 1
• For more information on zArchitecture: https://ibm.box.com/s/cb2z9ie5ozubnolhhd39k8co5juxwabk

• Why LinuxONE makes technical and financial sense: https://ibm.box.com/s/g8ijfius7cw8pndqa4icpxv0hopcdgxp

• IBM LinuxONE Cost Savings Estimator demo: https://ibm.ent.box.com/s/14s3pubez6yo90uapnnmb5pc3jxfarbc

o A full set of the Linux supported software for the different distributions can be found online.

o IBM Software Supported on RHEL 7 on LinuxONE at https://www.ibm.com/software/reports/compatibility/clarity-reports/report/html/productsOnOs?osId=1394485402300

o IBM Software Supported on SLES 15 on LinuxONE at https://www.ibm.com/software/reports/compatibility/clarity-reports/report/html/productsOnOs?osId=51F3735094B911E8A5E6A380334DFF95

o IBM Software Supported on Ubuntu 18.04 LTS on LinuxONE at https://www.ibm.com/software/reports/compatibility/clarity-reports/report/html/productsOnOs?osId=174FD630C56F11E88857DCC2171712A1
o Open Source on IBM LinuxONE at https://www.ibm.com/it-infrastructure/linuxone/capabilities/open-source

• IBM Z Software: https://www.ibm.com/it-infrastructure/z/software

• DB2 – https://www.ibm.com/analytics/db2

• CICS – https://www.ibm.com/it-infrastructure/z/cics

• Link to CICS video: https://www.youtube.com/watch?v=wAP_kwo8MKk&feature=youtu.be

• IMS – https://www.ibm.com/it-infrastructure/z/ims

• Link to IMS video: https://www.youtube.com/watch?v=2-lhvpdrcz4&feature=youtu.be

• WebSphere Application Server – https://www.ibm.com/cloud/websphere-application-server

Lesson 2

• COBOL – https://www.ibm.com/us-en/marketplace/ibm-cobol-compiler-family

• PL/I – https://www.ibm.com/us-en/marketplace/ibm-pli-compiler-family

• DevOps

o zDevOps technologies – https://www.ibm.com/it-infrastructure/z/capabilities/enterprise-devops
o DevOps for Dummies – https://www.ibm.com/account/reg/signup?formid=urx-15681
o Link to DevOps video: https://www.youtube.com/watch?v=8EeyzV5o5jw&feature=youtu.be

• MQ – https://www.ibm.com/products/mq

• z/OS Connect Enterprise Edition – https://www.ibm.com/us-en/marketplace/connect-enterprise-edition

• Mainframe Dev as a Resource – https://developer.ibm.com/mainframe

• More details on APIs on Z

o Hybrid Cloud on IBM z: https://www.ibm.com/it-infrastructure/z/capabilities/hybrid-cloud
o Knowledge Center: https://www.ibm.com/support/knowledgecenter/en/SS4SVW_beta/welcome/WelcomePage.html

Lesson 3

• Learn how to achieve Operational Excellence – https://www.ibm.com/it-infrastructure/z/capabilities/it-operations-management

• Keep updated with IBM Z ITSM Newsletter – http://www.pages02.net/ibm-zsystemsservices/itsm/subscribe/

• Zowe

o General information – https://www.zowe.org/
o Master the mainframe – Zowe overview – https://mediacenter.ibm.com/media/1_sgxr9542?mhsrc

• IBM Mainframe Security – https://www.ibm.com/security/mainframe-security

• IBM Security – https://www.ibm.com/security

• Key Digital Trust solutions

o Resource Access Control Facility – RACF – https://www.ibm.com/us-en/marketplace/resource-access-control-facility-racf
o IBM Security Identity Governance and Intelligence – https://www.ibm.com/us-en/marketplace/identity-governance-and-intelligence
o IBM Multi-Factor Authentication for z/OS – https://www.ibm.com/us-en/marketplace/ibm-multifactor-authentication-for-zos
o IBM Security Key Lifecycle Manager for z/OS – https://www.ibm.com/us-en/marketplace/security-key-lifecyle-manager-for-zos
o IBM Security Guardium – https://www.ibm.com/security/data-security/guardium

• Key Threat Management solutions:

o IBM QRadar Security Information and Event Management – https://www.ibm.com/us-en/marketplace/ibm-qradar-siem
o IBM Security zSecure suite – https://www.ibm.com/security/mainframe-security/zsecure

FAQs

1. It is correct to say about Threat Management:
– This part of our portfolio is battle-tested, award-winning, and innovative
– Stops threats with teams that can run your client’s Security Operations Center and respond at a moment’s notice with some of the most powerful tools in the business

2. Who is NOT a typical IBM Z client?
– One who is responsible for non-business-critical data and transactions

3. Select the correct options that describe DevOps
– DevOps is the union of People, Process, and Tools to enable continuous integration and continuous delivery
– It is short for Development and Operations
– It has the capabilities necessary to enable your customer to begin their application development and delivery transformation on IBM Z

4. Successful organizations will see APIs not just as technical tools, but as sources of strategic value in today’s digital economy. Most of the new key programming languages are designed to fit nicely into the API economy for the enterprise.
– True

5. IBM Z Operating System, or z/OS has been around in one form or another since the 1960s, but applications developed back then can NOT run on the latest version of z/OS anymore.
– False

6. Digital Trust solutions deliver:
– The help that customers need to build a scalable identity and digital trust program that secures business
– Endpoint management capabilities needed for security professionals to keep up with digital transformation
– Data protection, application security, identity and fraud management, and cloud and mobile security

7. Select the option that is NOT correct:
– IBM does not offer new version of compilers

8. Select how Db2 extends the value delivered to client enterprises:
– Enables easy access, scale, and application development for the mobile enterprise
– Continues to be the industry gold standard for availability, reliability, and security for business-critical information
– Delivers business insights faster while helping to reduce costs

9. Digital transformation has placed new burdens on IT operations and management. IBM’s view is that these challenges can be addressed by embracing the following 3 paradigm shifts:
– Enable change without compromising on reliability
– Manage visibility into hybrid and multi-cloud applications
– Empower operations with intuitive user experiences and analytics

10. The purpose of the Middleware is:
– Insulate the application from the physical environment, increasing the programmer’s productivity

11. Enterprises need continuous software delivery and management. This enables organizations to innovate more rapidly to capitalize on new market opportunities. Reducing development cycle times is normally not a concern for most organizations.
– False

12. MQ provides a ‘safe place’ for messages between applications. Select the options that are true related to MQ:
– Applications or servers being down doesn’t mean that messages are lost. Offers asynchronous messaging
– Delivers each message once. Once-and-once-only delivery is hugely important
– MQ enables completely different applications and systems to work together

13. What are the key Digital Trust solutions for IBM Z?
– Resource Access Control Facility – RACF, IBM Security Identity Governance and Intelligence, IBM Multi-Factor Authentication for z/OS, IBM Security Key Lifecycle Manager for z/OS, IBM Security Guardium

14. DevOps approach does NOT offer the following:
– Stops threats and responds at a moment’s notice with some of the most powerful tools in the business

15. Which option is NOT a critical integrated area that the IBM Security strategy focuses on?
– Password protection

16. IBM’s key middleware applications include:
– IBM Information Management System (IMS)
– WebSphere Application Server (WAS)
– CICS Transaction Server for z/OS

17. The benefits of IBM MQ include:
– Reduces cost of integration and accelerates time to deployment
– Offers a single, robust, and trusted messaging backbone for dynamic heterogeneous environments
– Preserves message integrity and helps minimize risk of information loss

18. IBM MQ is the messaging middleware that simplifies and accelerates the integration of diverse applications and business data across multiple platforms. It uses message queues to facilitate the exchange of information between applications, systems, services, and files and simplifies the creation and maintenance of business applications.
– True

19. Integrating IBM Z into the hybrid cloud unleashes the potential and value of the enterprise data and applications.
– True

20. CICS Transaction Server provides the following services:
– A highly efficient and optimized environment for running transactions
– Access by applications to data stored in Db2 and IMS, among others
– Interoperation with IBM MQ and access to the Message Queue Interface from CICS application programs

21. Linux on IBM Z enables customers to run the Linux operating system on IBM Z hardware. Select 3 key points about Linux, which is the fastest growing operating system in today’s business environment:
– Linux supports multiple hardware platforms, offering application portability
– Linux has an affinity with virtualization, and it is supported on all major hypervisors, from z/VM to VMware and Hyper-V
– Linux is developed by a highly skilled, open source community

22. Which of the following security solutions protects resources by granting access only to authorized users of the protected resources, retains profile information about users, and refers to these profiles when deciding who should be permitted access?
– Resource Access Control Facility – RACF

23. Select the correct statements about APIs
– An API is a software-to-software interface
– APIs allow users to update data and companies to leverage core business processes and business logic that has taken years to develop
– The mainframe APIs expose the data that is requested from other APIs that are in the cloud or in other mobile applications

24. What are the key Threat Management solutions for IBM Z?
– IBM QRadar Security Information and Event Management and IBM Security zSecure suite

25. Why should your clients also focus on securing IBM Z?
– Enterprise threats are becoming increasingly sophisticated and continue to profile and bypass traditional defenses
– Mainframe processes, procedures, and reports are often siloed from the rest of the organization
– The data that resides there is the ultimate target for security threats.

AI Fundamentals

Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL) are three popular terms tech companies can’t stop talking about today, and for good reason: they represent a major step forward in how computers can learn. But these terms are not interchangeable. Instead, you can think of AI, ML, and DL as a set of Russian dolls nested within each other, with the largest, outermost doll being AI and the smallest, innermost doll being DL (leaving ML in the middle).

AI is the all-encompassing computer science concept that’s concerned with building smart machines that are capable of performing tasks that normally require human intelligence. Machine learning refers to a broad set of techniques that give computers the ability to “learn” by themselves, using existing data and one or more “training” algorithms, instead of having to be explicitly programmed. And, deep learning is a technique for implementing ML that relies on deep artificial neural networks to perform complex tasks such as image recognition, object detection, and natural language processing. (Neural networks are a set of algorithms – modeled loosely after the neural networks found in the human brain – that are designed to recognize hidden patterns in data.)

AI can be as simple as a set of IF-THEN programming statements or as complex as a statistical model that’s capable of mapping raw sensory data to symbolic categories. IF-THEN statements are nothing more than rules that have been explicitly coded by a human programmer; in fact, multiple IF-THEN statements used together are sometimes referred to as rules engines, expert systems, symbolic AI, or simply GOFAI – an acronym for Good Old-Fashioned AI. Because of this, some argue that when a computer program designed by AI researchers succeeds at doing something complex – such as beating a world champion at chess – this doesn’t really demonstrate AI, because the algorithm’s internals are well understood by the researchers; that true AI only refers to computer systems that have been designed to be cognitive, to be aware of context and nuance, and to make decisions that are the result of a reasoned analysis. And, to some extent, they’re correct.
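
To make the idea of IF-THEN rules concrete, here is a minimal sketch of a hand-coded rules engine in Python. The rule set, thresholds, and the `classify_temperature` name are hypothetical examples, not taken from any product; the point is only that every decision is written out explicitly by a programmer.

```python
# Each rule pairs a condition (the IF part) with a conclusion (the THEN part).
# The thresholds below are arbitrary, illustrative values.
RULES = [
    (lambda celsius: celsius < 0, "freezing"),
    (lambda celsius: celsius < 15, "cold"),
    (lambda celsius: celsius < 25, "mild"),
]


def classify_temperature(celsius: float) -> str:
    """Return the conclusion of the first rule whose condition matches."""
    for condition, conclusion in RULES:
        if condition(celsius):
            return conclusion
    return "hot"  # default when no rule fires


print(classify_temperature(-5))  # freezing
print(classify_temperature(20))  # mild
print(classify_temperature(30))  # hot
```

Nothing here is learned from data: changing the behavior means editing the rules by hand, which is what distinguishes this style of AI from machine learning.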

There are really two types of AI: Strong AI (sometimes called True AI or General AI), and Weak AI (also known as Narrow AI or Modern AI). Strong AI is the type of AI that’s concerned with building smart machines that are capable of performing tasks that normally require human intelligence. This type of AI has been sensationalized by movies like “I, Robot” and “The Matrix”. Weak AI, on the other hand, refers to technologies that rely on algorithms and programmatic responses to perform specific tasks – especially repeatable tasks – as well as, or better than, human beings. And while we’re not anywhere close to being able to deliver Strong AI today, we can provide Weak AI, and it is this type of AI that has garnered a significant amount of interest in recent years.

Machine learning is a subset of AI and one aspect that separates it from other types of AI (such as rules-based engines) is its ability to modify itself when exposed to more data. The “learning” part of ML means that algorithms attempt to optimize along a certain dimension; typically, to minimize error or maximize the likelihood their predictions will be correct.
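
The "optimize along a certain dimension" idea can be sketched with a very small example: fitting a single parameter by gradient descent so that the mean squared error shrinks at every step. This is an illustrative sketch only; the `fit_slope` function, the learning rate, and the data points are assumptions made up for the example.

```python
def fit_slope(xs, ys, learning_rate=0.01, steps=1000):
    """Fit y ~ w * x by gradient descent on the mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of mean((w * x - y) ** 2) with respect to w.
        gradient = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        # Move w a small step in the direction that reduces the error.
        w -= learning_rate * gradient
    return w


# Made-up training data that follows y = 2 * x exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w = fit_slope(xs, ys)
print(round(w, 3))  # 2.0
```

The program is never told that the slope is 2; it arrives at that value by repeatedly adjusting `w` to reduce its prediction error on the data.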

Deep learning is a subset of ML and usually, when people use the term deep learning, they are referring to deep artificial neural networks. “Deep” is a technical term that refers to the number of layers that are used in a neural network. A shallow network will have one hidden layer while a deep network can contain hundreds. Multiple hidden layers allow deep neural networks to learn features of data in a so-called feature hierarchy, where simple features (for example, pixels) recombine from one layer to the next, to form more complex features (for instance, images). Neural networks with many layers pass data (features) through more mathematical operations than those with fewer layers, and therefore are more computationally intensive to train.
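As a rough illustration of why depth costs compute, the sketch below builds a shallow and a deeper fully connected network and counts the multiplications in one forward pass. The layer sizes, random weights, and the use of ReLU on every layer (including the output) are assumptions made for the example. It shows only inference, not training, which repeats this work many times.

```python
import random

# Hypothetical illustration: counting the multiplications in a forward pass.
# Layer sizes and weights are arbitrary; no training happens here.


def make_network(layer_sizes, seed=0):
    """Build one random weight matrix per pair of consecutive layer sizes."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    ]


def relu(v):
    return max(0.0, v)


def forward(network, inputs):
    """Pass inputs through every layer; return the outputs and the multiply count."""
    activations = inputs
    multiplies = 0
    for weights in network:
        next_activations = []
        for row in weights:
            # Each output unit is a weighted sum of the previous layer, then ReLU.
            next_activations.append(relu(sum(w * a for w, a in zip(row, activations))))
            multiplies += len(row)
        activations = next_activations
    return activations, multiplies


shallow = make_network([4, 8, 2])        # one hidden layer
deep = make_network([4, 8, 8, 8, 8, 2])  # four hidden layers

x = [0.5, -0.2, 0.1, 0.9]
shallow_out, shallow_ops = forward(shallow, x)
deep_out, deep_ops = forward(deep, x)
print(shallow_ops, deep_ops)  # 48 240
```

Each added hidden layer of width 8 contributes another 64 multiplications per input here, which is the sense in which more layers mean more mathematical operations.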

Prior to 2010, very little work was being done in DL because even shallow neural networks are extremely compute-intensive. Deep learning moved from research papers and laboratories to real-world applications only when three forces converged: the availability of immense stores of labeled data; the invention of new deep learning algorithms and activation functions; and cheaper, more powerful CPUs (central processing units) coupled with the intense performance of GPUs (graphics processing units, which were designed to render images, animations, and video for a computer’s screen and are exceptional at matrix-matrix multiplication, a cornerstone of deep learning). Now, the way in which deep learning can be adopted and used has become a critical strategic consideration for every cognitive enterprise.

As was pointed out earlier, the terms AI, ML, and DL are frequently misused, but generally, we can refer to the discipline for all three as a path to Cognitive Computing (CC), which IBM is highly organized around. The best way to describe all of these disciplines in layman’s terms is that they are technologies for analyzing large volumes of data to find common patterns and turning those patterns into actionable predictions and insights.

It is important to understand that more is needed to get to cognitive computing than is presented here. For instance, governance and validation of the data used is critical. In fact, one of the biggest challenges AI faces today is ensuring that AI systems are not being taught with biased, incorrect, or non-curated data. IBM has deep expertise and market-leading practices and solutions in this area to support a customer’s cognitive computing journey.

FAQ’s

1. Which of the following correctly describes the result of the following SQL DELETE statement: “DELETE FROM EMPLOYEES”?
– All data rows within the EMPLOYEES table will be deleted.

2. What are the two primary workloads supported by a relational database?
– Transaction processing (operational) workloads and data warehouse and analytics workloads.

3. Which of the following is an advantage of a view?
– Can be used to remove sensitive columns from a table definition for query use and can be used to combine information from multiple tables to simplify queries.

4. What is the name for the underlying structure of a table index in a relational database?
– B-tree.

5. What is the primary workload that is ideal for a column-organized table?
– Data warehouse workload.

6. Which of the following correctly describes the result of the following SQL UPDATE statement: “UPDATE EMPLOYEES SET SALARY=100000 WHERE EMPLOYEE_ID = 1240”?
– The record in the EMPLOYEES table that has an EMPLOYEE_ID equal to 1240 will have the SALARY column changed to 100000.

7. Which of the following are examples of an operational (transaction processing) workload?
– Shipping logistics, airline reservations, payroll systems, manufacturing systems, and online retail systems.

8. What are the two methods of table access in a relational database that were discussed in the lesson?
– A full table scan or through the use of a table index.

9. What is the language that was developed for accessing data within a relational database?
– Structured Query Language (SQL)

10. Choose the word pair that best completes the following definition: A table within a relational database consists of ___ that describe a single ___.
– Attributes / Entity

11. What are the two types of tables that are available within relational databases today?
– Row-organized tables and Column-organized tables.

12. What does the acronym MPP mean with respect to relational database terminology?
– Massively Parallel Processing.

13. Which of the following statements is true about a foreign key on a database table?
– Ensures that a data row cannot be inserted into the table without the corresponding primary key already existing in the parent table.

14. What are the characteristics of an operational (transaction processing) workload?
– Number of concurrent transactions is large (100s to 1000s per hour).

15. Which of the following accurately define the ACID property within a relational database?
– Atomicity, Consistency, Isolation, and Durability.

16. What does the WHERE clause portion of a SQL SELECT statement contain?
– The column predicates (such as LAST_NAME = “Baker”) to restrict the data being returned by the SQL statement.

17. Which of the following is one of the design points of a relational database?
– Minimize the duplication of data in multiple tables within a database.

18. Which of the following is NOT a component of a relational table definition?
– Number of rows contained within the table.

19. Which of the following correctly defines a foreign key?
– A key (consisting of a column or group of columns) that is placed within a table to provide a reference point to another table or tables within a database.

20. Which of the following statements best describes what must be done prior to issuing SQL statements against a database?
– You must connect to the database prior to issuing SQL statements.

21. Which of the following is the correct definition of a composite index?
– An index that is defined on multiple columns within a table.

22. Which of the following is NOT a characteristic of a primary key on a relational database table?
– Ensures that the column name cannot be used within any other table in the database.

23. If an application consists of many, frequent updates (inserts, deletes, or updates) within a short time duration (seconds) where each transaction accesses a small number of rows from a large number of users, this is an example of what type of workload?
– Operational workload.

24. If an application consists of data selection from several tables and each query accesses a relatively large data set (1000s or more rows) across a small group of users, this is an example of what type of workload?
– Data warehouse and analytics workload.

25. Which of the following statements is true?
– A view on a table uses minimal storage within the database.
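Several of the answers above can be checked hands-on. The following is a minimal sketch that uses Python's built-in `sqlite3` module with an in-memory database as a stand-in; it is not tied to any IBM database product, and SQLite's behavior may differ in details from other relational databases. The EMPLOYEES table, EMPLOYEE_ID, SALARY, and LAST_NAME come from the questions; the DEPARTMENTS table, the DEPT_ID column, the index and view names, and all row values are hypothetical.

```python
import sqlite3

# Question 20: connect to the database before issuing any SQL.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE DEPARTMENTS (DEPT_ID INTEGER PRIMARY KEY, DEPT_NAME TEXT);
CREATE TABLE EMPLOYEES (
    EMPLOYEE_ID INTEGER PRIMARY KEY,
    LAST_NAME   TEXT,
    SALARY      INTEGER,
    DEPT_ID     INTEGER REFERENCES DEPARTMENTS (DEPT_ID)
);
-- Question 21: a composite index is defined on more than one column.
CREATE INDEX EMP_NAME_DEPT ON EMPLOYEES (LAST_NAME, DEPT_ID);
-- Question 3: a view can hide a sensitive column such as SALARY.
CREATE VIEW EMPLOYEE_DIRECTORY AS SELECT EMPLOYEE_ID, LAST_NAME FROM EMPLOYEES;
INSERT INTO DEPARTMENTS VALUES (10, 'Shipping');
INSERT INTO EMPLOYEES VALUES (1240, 'Baker', 90000, 10);
INSERT INTO EMPLOYEES VALUES (1241, 'Chen', 85000, 10);
""")

# Question 6: UPDATE with a WHERE clause changes only the matching row.
conn.execute("UPDATE EMPLOYEES SET SALARY = 100000 WHERE EMPLOYEE_ID = 1240")
salaries = conn.execute(
    "SELECT EMPLOYEE_ID, SALARY FROM EMPLOYEES ORDER BY EMPLOYEE_ID"
).fetchall()
print(salaries)  # [(1240, 100000), (1241, 85000)]

# Question 13: a row whose DEPT_ID has no matching parent row is rejected.
try:
    conn.execute("INSERT INTO EMPLOYEES VALUES (1242, 'Diaz', 70000, 99)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
print(fk_rejected)  # True

# Question 3: the view exposes no SALARY column.
view_columns = [d[0] for d in conn.execute("SELECT * FROM EMPLOYEE_DIRECTORY").description]
print(view_columns)  # ['EMPLOYEE_ID', 'LAST_NAME']

# Question 1: DELETE without a WHERE clause removes every row; the table itself remains.
conn.execute("DELETE FROM EMPLOYEES")
remaining = conn.execute("SELECT COUNT(*) FROM EMPLOYEES").fetchone()[0]
print(remaining)  # 0
```

The `PRAGMA foreign_keys = ON` line is specific to SQLite, which leaves foreign key enforcement off by default.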
