From 2e6dded4fca0b446f66aa2c539b59d03ac0bc182 Mon Sep 17 00:00:00 2001 From: Jakub Krzywda Date: Thu, 3 Jun 2021 10:47:22 +0200 Subject: [PATCH 01/18] Generalize the checklist for non Kubernetes environments --- README.md | 292 ++++++------------------------------------------------ 1 file changed, 31 insertions(+), 261 deletions(-) diff --git a/README.md b/README.md index 870a12a..b8f9942 100644 --- a/README.md +++ b/README.md @@ -1,11 +1,9 @@ -## Kubernetes Information Security Review Checklist - +# Cloud Information Security Review Checklist ## Governance, risk management, and compliance 1. What regulations / information security standards do you need to comply with? - 1. ☐ FFFS 2014:7 2. ☐ SOC-2 3. ☐ PCI-DSS @@ -15,526 +13,298 @@ 7. ☐ GDPR 8. Other - 2. Who/what role in your organisation is responsible for compliance? - 3. Does your organization have general information security policies outside of your compliance requirements in place? (e.g., “We need to store data in Brazil to appeal to national sentiments.”) - ### Mapping of Standards to Policies What policy statements do you have in place to show your compliance with the above standard(s)? - ### Mapping of Policies to Implementation - - -1. How do you track implementation? - +1. How do you track implementation? 2. What is your development/implementation methodology? - 1. How do you take high level policy requirements and create implementable units of work? - - -3. How do you track and plan implementation/development? - +3. How do you track and plan implementation/development? ### Evidence of Implementation fulfilling Policies - 1. When a policy is implemented how is this validated? - 2. Is this validation recorded? - 3. Can an auditor access this record? - ## High-Level General Practices - ### Supply Management - - -1. Who supplies your Kubernetes cluster(s) infrastructure today? - +1. Who supplies your cloud infrastructure today? 2. 
Is the underlying cloud providers infrastructure in line with your compliance requirements? - 3. Are the underlying VMs / load-balancers / storage sufficiently protected? (e.g., via firewalls) - -4. Is your Kubernetes cluster connected to any managed services? (e.g., database-as-a-service, logging-as-a-service, incident-management-as-a-service) - +4. Is your application connected to any managed services? (e.g., database-as-a-service, logging-as-a-service, incident-management-as-a-service) ### Separation of testing and production - - 1. Do you separate testing from production clusters? - 2. Do you take production data into testing? - 3. What is your source of test data? - 4. Is your testing infrastructure sufficiently protected? - ### Access Control - - -1. Describe your access control system for Kubernetes? - +1. Describe your access control system? 2. Is multi-factor authentication enabled? - -3. How many people have access to the production Kubernetes cluster? - +3. How many people have access to the production cluster? 1. Is role based access implemented? (Read-only, Read-Write) - - 4. Do you have one user account per person? - -5. Do you keep an audit trail on Kubernetes API access? - +5. Do you keep an audit trail on access to the production cluster? 6. Is the audit trail stored in a tamper-proof logging[^1] environment? - 7. Do you regularly review access, e.g., revoking access to people leaving your organization? - ### Logging - - -1. Do you forward nodes (journald), Kubernetes (API server) and application logs to a tamper-proof logging environment? - +1. Do you forward nodes (journald), cloud platform and application logs to a tamper-proof logging environment? 2. What is your retention policy? What is the minimum amount of time you keep logging entries? What is the maximum amount of time you keep logging entries? - 1. Source of retention policy? Specified in legislation or company policy? If company policy, why that figure? - - ### Backups - - -1. 
Do you perform regular backups of your production Kubernetes clusters? - +1. Do you perform regular backups of your production clusters? 2. Do you regularly test restoring from backups? - ### Change Management - - 1. What is the journey of code changes? (e.g., How does a new feature make it into production?) - -2. What is the journey of infrastructure changes? (e.g., How do you create a new Kubernetes cluster? How do you update a SecurityGroup?) - +2. What is the journey of infrastructure changes? (e.g., How do you create a new cluster? How do you update a SecurityGroup?) 3. What is the journey of data changes? (e.g., How do you add a new column to your database?) - 4. Do you enforce a 2-person policy for performing system changes? - 5. Do you use [semantic versioning](https://semver.org/) for container image tags? - 6. How do you perform data migration? - 7. In the event of a ‘bad’ deployment, can you rollback to a previous good state? - 8. Is this rollback ability tested regularly? - 9. Do you do Canary deployments? - ### Technical Vulnerability Management - - 1. Do you scan containers for vulnerabilities before entering production? - 2. Do you have a process in place to get alerted when a container becomes vulnerable in production? - 3. Do you enforce deployment only from known-good container image registries? - 4. After how much time do you fix known container image vulnerabilities? - -5. Is the production Kubernetes cluster updated, as required to avoid vulnerabilities? - +5. Is the production cluster updated, as required to avoid vulnerabilities? 6. Is the underlying OS image updated, as required to avoid vulnerabilities? - 7. Are other adjacent services (e.g., container registry, logging environment, identity provider) updated, as required to avoid vulnerabilities? - ### Use of Cryptography - - 1. Do you encrypt traffic over open networks? Do you use HTTPS over the Internet? - 2. Do you regularly rotate cryptographic keys? - 3. Do you use HSTS[^2]? 
- ### Network Segregation - - -1. Do you have NetworkPolicies in place? - +1. Do you use network segregation? ### Intrusion Detection / Prevention +1. Do you have an intrusion detection system in the production cluster? - -1. Do you have an intrusion detection system in the Kubernetes cluster? - - -2. Do you have a web application firewall in front or within the Kubernetes cluster? - +2. Do you have a web application firewall in front or within the production cluster? 3. Do you alert on blocked outbound traffic? - ### Business Continuity +1. Are your systems designed with sufficient redundancy? (e.g., multiple availability zones, multiple servers, multiple application replicas) - -1. Are your systems designed with sufficient redundancy? (e.g., multiple availability zones, multiple Kubernetes nodes, multiple Pod replicas) - - -2. Do you regularly test your redundancy? (e.g., by failing a Kubernetes node and killing a Pod replica) - +2. Do you regularly test your redundancy? (e.g., by failing a server and killing an application replica) 3. Do at least two team members have access to each system? - ### Incident Management - - 1. Are your systems sufficiently documented for incident management? - 2. Do you have a clear definition of what is an incident? - 3. Does your team have clear procedures for who handles incidents and how to handle incidents? - 4. Are there situations where you need to fix data in a production environment? - 1. If yes, what is your process? - - #### Incident reporting: Internal - - 1. How do internal users report incidents? - 2. Is there an escalation process in place if an incident is not responded to in time? - #### Incident reporting: External - - 1. How do external users report incidents? - 2. Is there an escalation process in place if an incident is not responded to in time? +## Cluster Security -## Kubernetes Cluster Security - - - -1. 
Does your cluster pass the [CIS Kubernetes security benchmark](https://github.com/aquasecurity/kube-bench)? - - -2. Do you have RBAC enabled? - - -3. Do you use PodSecurityPolicy? - - -4. Do you use OpenPolicyAgent? - - -## Kubernetes Workload Security - - - -1. Do you enforce appropriate Pod SecurityContext? - - - 1. Do you enforce no privileged Pods? - 2. Do you enforce minimum Pod capabilities? - 3. Do you enforce no Pods running as root? - 4. Do you enforce no Pod sysctls? - 5. Do you enforce no hostPath-s? - 6. Do you enforce no nodePort / host network? - - - -2. Do you enforce no [automountServiceAccountToken](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/)? - - -3. Do you enforce Pod resource limits? - - -4. Do you enforce minimum Pod replication? - - -5. Do you enforce a read-only Pod file system? - - -6. Are all Pods properly labeled? - - -7. Are Pod container images pinned to a specific version (i.e., no “latest”)? - - -8. Do you use Secrets appropriately? - - -9. Do you restrict access to Secrets? - - -10. Do you enforce Ingress TLS? - - -11. Do you have sufficient and tested NetworkPolicies in place? - - -12. Did you run through other [Kubernetes Best Practices](https://learnk8s.io/production-best-practices)? - - -## Kubernetes Development Practices +1. Do you use Role-Based Access Control? +2. Do you use OpenPolicyAgent? +## Development Practices 1. How is a container image built? +2. How is a container image deployed into the production cluster? -2. How is a container image deployed into the Kubernetes cluster? - - -3. How do you manage Kubernetes resources? (e.g., Helm) - +3. How do you manage deployments? 4. Do you practice GitOps[^3]? - 5. Do you fully separate development from production environments? - 6. Is there a fully separate environment for testing/QA? - ### Development Environment - #### Access control - - 1. How is access managed to this environment? (Role based, per user etc) - 2. 
Describe your source control systems and process - ### Testing/QA Environment - #### Access control - - 1. How is access managed to this environment? (Role based, per user etc) - #### Test data - - 1. How is test data created/sourced? - 2. Does data come from production data? - 1. If yes, is it anonymised? - - ### Production Environment - #### Access control - - 1. How is access managed to this environment? (Role based, per user etc) - ### Code/build Deployment +##### Application -##### Kubernetes - - - -1. Describe your pod deployment process - - -2. What checks do you carry out on pods at deployment time? +1. Describe your application deployment process +2. What checks do you carry out on your application at deployment time? ##### Volume mounts/Data sources - - 1. Describe your deployment/update process for data sources. - 2. Is your data persistence layer built using the Infrastructure as code paradigm? - 3. Is your data persistence layer version controlled? - #### Permissions - ##### Automated/manual system - -## Kubernetes Operational Practices - +## Operational Practices ### Health Checks - - -1. Do you maintain KPIs for determining if your Kubernetes cluster is working well? (e.g., USE: utilization, saturation, errors) - +1. Do you maintain KPIs for determining if your production cluster is working well? (e.g., USE: utilization, saturation, errors) 2. Do you have relevant alerts in place? - 3. Do you have alerts for “slowly filling and getting full”? (e.g., disk space) - -4. Do you require [Liveness, Readiness and Startup Probes](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/)? - +4. Do you require Liveness, Readiness and Startup Probes? ### Operational Logging and Alerts - - 1. Do you have relevant alerts in place? - 2. Do you have a defined process for handling alerts? - 3. What is the retention period for these logs? - #### SLA’s/SLO’s management - - 1. 
Do you maintain KPIs for determining if your users are served well? (e.g., daily active users, log-ins per day, registrations per day) - 2. Do you maintain KPIs for determining if your application is working well? (e.g., RED: request rate, error rate, duration) - #### Path to resolution Describe the chain of events from an alert/issue raised to resolution - ### Security Logging and Alerts - - 1. Do you have relevant alerts in place? - 2. Do you have a defined process for handling alerts? - 3. What is the retention period for these logs? - #### Triage - - 1. Do you have processes and systems in place for evaluating security incidents? (Info, Low, Medium, High, Critical) - 2. Do you have processes and systems in place for informing/alerting external shareholders of relevant incidents? (Data leaks, System availability etc) - #### Path to resolution Describe the chain of events from an alert/issue raised to resolution - ## Notes [^1]: - By “tamper-proof” it is understood that a person gaining (authorized or unauthorized access) to the production Kubernetes cluster cannot modify or delete existing log entries. + By “tamper-proof” it is understood that a person gaining (authorized or unauthorized access) to the production cluster cannot modify or delete existing log entries. [^2]: https://en.wikipedia.org/wiki/HTTP_Strict_Transport_Security [^3]: - By “GitOps” we mean that all system changes (except in “break glass” scenarios) need to be performed via git commits. \ No newline at end of file + By “GitOps” we mean that all system changes (except in “break glass” scenarios) need to be performed via git commits. 
From c1c63ddfe61490d05e8e394024471047a3fdf172 Mon Sep 17 00:00:00 2001
From: raviranjanelastisys <75951259+raviranjanelastisys@users.noreply.github.com>
Date: Thu, 3 Jun 2021 15:46:46 +0200
Subject: [PATCH 02/18] Update README.md

---
 README.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/README.md b/README.md
index b8f9942..b73dc29 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-# Cloud Information Security Review Checklist
+# Google Cloud Information Security Review Checklist
 
 ## Governance, risk management, and compliance
 
@@ -43,13 +43,13 @@ What policy statements do you have in place to show your compliance with the abo
 
 ### Supply Management
 
-1. Who supplies your cloud infrastructure today?
+1. Is google cloud infrastructure in line with your compliance requirements?
 
-2. Is the underlying cloud providers infrastructure in line with your compliance requirements?
+2. Is there a multi-cloud architecture ? if yes , what other cloud provided services used ?
 
 3. Are the underlying VMs / load-balancers / storage sufficiently protected? (e.g., via firewalls)
 
-4. Is your application connected to any managed services? (e.g., database-as-a-service, logging-as-a-service, incident-management-as-a-service)
+4. Is your application connected to any google cloud managed services? (e.g., database-as-a-service, logging-as-a-service, incident-management-as-a-service)
 
 ### Separation of testing and production
 
@@ -81,7 +81,7 @@ What policy statements do you have in place to show your compliance with the abo
 
 ### Logging
 
-1. Do you forward nodes (journald), cloud platform and application logs to a tamper-proof logging environment?
+1. Do you forward nodes (journald), google cloud platform and application logs to a tamper-proof logging environment?
 
 2. What is your retention policy? What is the minimum amount of time you keep logging entries? What is the maximum amount of time you keep logging entries?
 
@@ -117,7 +117,7 @@ What policy statements do you have in place to show your compliance with the abo 1. Do you scan containers for vulnerabilities before entering production? -2. Do you have a process in place to get alerted when a container becomes vulnerable in production? +2. Do you have a process in place to get alerted when a container/VM's becomes vulnerable in production? 3. Do you enforce deployment only from known-good container image registries? From d3230608be244949d78671b16b7a9877d163ac8f Mon Sep 17 00:00:00 2001 From: raviranjanelastisys <75951259+raviranjanelastisys@users.noreply.github.com> Date: Thu, 3 Jun 2021 16:00:23 +0200 Subject: [PATCH 03/18] Update README.md --- README.md | 30 +++++++++++++++--------------- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/README.md b/README.md index b73dc29..12d0a65 100644 --- a/README.md +++ b/README.md @@ -53,7 +53,7 @@ What policy statements do you have in place to show your compliance with the abo ### Separation of testing and production -1. Do you separate testing from production clusters? +1. Do you separate testing from production google cloud clusters? 2. Do you take production data into testing? @@ -67,13 +67,13 @@ What policy statements do you have in place to show your compliance with the abo 2. Is multi-factor authentication enabled? -3. How many people have access to the production cluster? +3. How many people have access to the google cloud production cluster? 1. Is role based access implemented? (Read-only, Read-Write) 4. Do you have one user account per person? -5. Do you keep an audit trail on access to the production cluster? +5. Do you keep an audit trail on access to the google cloud production cluster? 6. Is the audit trail stored in a tamper-proof logging[^1] environment? @@ -89,7 +89,7 @@ What policy statements do you have in place to show your compliance with the abo ### Backups -1. Do you perform regular backups of your production clusters? +1. 
Do you perform regular backups of your google cloud production clusters? 2. Do you regularly test restoring from backups? @@ -115,9 +115,9 @@ What policy statements do you have in place to show your compliance with the abo ### Technical Vulnerability Management -1. Do you scan containers for vulnerabilities before entering production? +1. Do you scan containers for vulnerabilities before entering google cloud production? -2. Do you have a process in place to get alerted when a container/VM's becomes vulnerable in production? +2. Do you have a process in place to get alerted when a container/VM's becomes vulnerable in google cloud production? 3. Do you enforce deployment only from known-good container image registries? @@ -143,9 +143,9 @@ What policy statements do you have in place to show your compliance with the abo ### Intrusion Detection / Prevention -1. Do you have an intrusion detection system in the production cluster? +1. Do you have an intrusion detection system in the google cloud production cluster? -2. Do you have a web application firewall in front or within the production cluster? +2. Do you have a web application firewall in front or within the google cloud production cluster? 3. Do you alert on blocked outbound traffic? @@ -165,7 +165,7 @@ What policy statements do you have in place to show your compliance with the abo 3. Does your team have clear procedures for who handles incidents and how to handle incidents? -4. Are there situations where you need to fix data in a production environment? +4. Are there situations where you need to fix data in a google cloud production environment? 1. If yes, what is your process? @@ -191,7 +191,7 @@ What policy statements do you have in place to show your compliance with the abo 1. How is a container image built? -2. How is a container image deployed into the production cluster? +2. How is a container image deployed into the google cloud production cluster? 3. How do you manage deployments? 
@@ -205,7 +205,7 @@ What policy statements do you have in place to show your compliance with the abo #### Access control -1. How is access managed to this environment? (Role based, per user etc) +1. How is access managed to this google cloud environment? (Role based, per user etc) 2. Describe your source control systems and process @@ -213,13 +213,13 @@ What policy statements do you have in place to show your compliance with the abo #### Access control -1. How is access managed to this environment? (Role based, per user etc) +1. How is access managed to this google cloud Test/QA environment? (Role based, per user etc) #### Test data 1. How is test data created/sourced? -2. Does data come from production data? +2. Does data come from google cloud production data? 1. If yes, is it anonymised? @@ -227,7 +227,7 @@ What policy statements do you have in place to show your compliance with the abo #### Access control -1. How is access managed to this environment? (Role based, per user etc) +1. How is access managed to this google cloud production environment? (Role based, per user etc) ### Code/build Deployment @@ -253,7 +253,7 @@ What policy statements do you have in place to show your compliance with the abo ### Health Checks -1. Do you maintain KPIs for determining if your production cluster is working well? (e.g., USE: utilization, saturation, errors) +1. Do you maintain KPIs for determining if your google cloud production cluster is working well? (e.g., USE: utilization, saturation, errors) 2. Do you have relevant alerts in place? 
From 7ecab9f158d11068d1caabe81ffa7847933b5a10 Mon Sep 17 00:00:00 2001 From: raviranjanelastisys <75951259+raviranjanelastisys@users.noreply.github.com> Date: Fri, 4 Jun 2021 09:54:11 +0200 Subject: [PATCH 04/18] Update README.md --- README.md | 38 +++++++++++++++++++------------------- 1 file changed, 19 insertions(+), 19 deletions(-) diff --git a/README.md b/README.md index 12d0a65..6574fb4 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ -# Google Cloud Information Security Review Checklist +# Cloud Information Security Review Checklist ## Governance, risk management, and compliance @@ -43,17 +43,17 @@ What policy statements do you have in place to show your compliance with the abo ### Supply Management -1. Is google cloud infrastructure in line with your compliance requirements? +1. Is cloud infrastructure in line with your compliance requirements? 2. Is there a multi-cloud architecture ? if yes , what other cloud provided services used ? 3. Are the underlying VMs / load-balancers / storage sufficiently protected? (e.g., via firewalls) -4. Is your application connected to any google cloud managed services? (e.g., database-as-a-service, logging-as-a-service, incident-management-as-a-service) +4. Is your application connected to any managed services? (e.g., database-as-a-service, logging-as-a-service, incident-management-as-a-service) ### Separation of testing and production -1. Do you separate testing from production google cloud clusters? +1. Do you separate testing from production cloud clusters? 2. Do you take production data into testing? @@ -67,13 +67,13 @@ What policy statements do you have in place to show your compliance with the abo 2. Is multi-factor authentication enabled? -3. How many people have access to the google cloud production cluster? +3. How many people have access to the production cluster? 1. Is role based access implemented? (Read-only, Read-Write) 4. Do you have one user account per person? -5. 
Do you keep an audit trail on access to the google cloud production cluster? +5. Do you keep an audit trail on access to the production cluster? 6. Is the audit trail stored in a tamper-proof logging[^1] environment? @@ -81,7 +81,7 @@ What policy statements do you have in place to show your compliance with the abo ### Logging -1. Do you forward nodes (journald), google cloud platform and application logs to a tamper-proof logging environment? +1. Do you forward nodes (journald), platform and application logs to a tamper-proof logging environment? 2. What is your retention policy? What is the minimum amount of time you keep logging entries? What is the maximum amount of time you keep logging entries? @@ -89,7 +89,7 @@ What policy statements do you have in place to show your compliance with the abo ### Backups -1. Do you perform regular backups of your google cloud production clusters? +1. Do you perform regular backups of your production clusters? 2. Do you regularly test restoring from backups? @@ -115,9 +115,9 @@ What policy statements do you have in place to show your compliance with the abo ### Technical Vulnerability Management -1. Do you scan containers for vulnerabilities before entering google cloud production? +1. Do you scan containers for vulnerabilities before entering production? -2. Do you have a process in place to get alerted when a container/VM's becomes vulnerable in google cloud production? +2. Do you have a process in place to get alerted when a container/VM's becomes vulnerable in production? 3. Do you enforce deployment only from known-good container image registries? @@ -143,9 +143,9 @@ What policy statements do you have in place to show your compliance with the abo ### Intrusion Detection / Prevention -1. Do you have an intrusion detection system in the google cloud production cluster? +1. Do you have an intrusion detection system in the production cluster? -2. 
Do you have a web application firewall in front or within the google cloud production cluster? +2. Do you have a web application firewall in front or within the production cluster? 3. Do you alert on blocked outbound traffic? @@ -165,7 +165,7 @@ What policy statements do you have in place to show your compliance with the abo 3. Does your team have clear procedures for who handles incidents and how to handle incidents? -4. Are there situations where you need to fix data in a google cloud production environment? +4. Are there situations where you need to fix data in a production environment? 1. If yes, what is your process? @@ -191,7 +191,7 @@ What policy statements do you have in place to show your compliance with the abo 1. How is a container image built? -2. How is a container image deployed into the google cloud production cluster? +2. How is a container image deployed into the production cluster? 3. How do you manage deployments? @@ -205,7 +205,7 @@ What policy statements do you have in place to show your compliance with the abo #### Access control -1. How is access managed to this google cloud environment? (Role based, per user etc) +1. How is access managed to this environment? (Role based, per user etc) 2. Describe your source control systems and process @@ -213,13 +213,13 @@ What policy statements do you have in place to show your compliance with the abo #### Access control -1. How is access managed to this google cloud Test/QA environment? (Role based, per user etc) +1. How is access managed to this Test/QA environment? (Role based, per user etc) #### Test data 1. How is test data created/sourced? -2. Does data come from google cloud production data? +2. Does data come from production data? 1. If yes, is it anonymised? @@ -227,7 +227,7 @@ What policy statements do you have in place to show your compliance with the abo #### Access control -1. How is access managed to this google cloud production environment? (Role based, per user etc) +1. 
How is access managed to this production environment? (Role based, per user etc) ### Code/build Deployment @@ -253,7 +253,7 @@ What policy statements do you have in place to show your compliance with the abo ### Health Checks -1. Do you maintain KPIs for determining if your google cloud production cluster is working well? (e.g., USE: utilization, saturation, errors) +1. Do you maintain KPIs for determining if your production cluster is working well? (e.g., USE: utilization, saturation, errors) 2. Do you have relevant alerts in place? From 21ec6dd1eb2060999d32e49c16cc2fd72c0ae6fa Mon Sep 17 00:00:00 2001 From: raviranjanelastisys <75951259+raviranjanelastisys@users.noreply.github.com> Date: Fri, 4 Jun 2021 10:38:38 +0200 Subject: [PATCH 05/18] Update README.md --- README.md | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/README.md b/README.md index 6574fb4..5e9ccb0 100644 --- a/README.md +++ b/README.md @@ -129,6 +129,15 @@ What policy statements do you have in place to show your compliance with the abo 7. Are other adjacent services (e.g., container registry, logging environment, identity provider) updated, as required to avoid vulnerabilities? + +### Case-1- Cloud Run Container Security + +1. What is the source of base image ? Is is signed one ? Do we have a lean base image +2. How vulnerabilities found at Non-OS level (Python, npm, ruby gems, etc) +3. Is your container follows CIS benchmark ? +4. Is there any extra packages in container that can be security vulnerabilities ? +5. Is your container running as a user ? + ### Use of Cryptography 1. Do you encrypt traffic over open networks? Do you use HTTPS over the Internet? 
From c968c038075cd265bf01f775b4dad413bb0b4be7 Mon Sep 17 00:00:00 2001
From: raviranjanelastisys <75951259+raviranjanelastisys@users.noreply.github.com>
Date: Fri, 4 Jun 2021 11:03:43 +0200
Subject: [PATCH 06/18] Update README.md

---
 README.md | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/README.md b/README.md
index 5e9ccb0..c661d06 100644
--- a/README.md
+++ b/README.md
@@ -138,6 +138,11 @@ What policy statements do you have in place to show your compliance with the abo
 4. Is there any extra packages in container that can be security vulnerabilities ?
 5. Is your container running as a user ?
 
+### Case-2- Cloud Run Authentication
+
+1. Do we have Identity provider access control in place ? Is it both for the user and other service ?
+2. Do we have the service account for cloud run service ? If so , what permissions are provided ?
+
 ### Use of Cryptography
 
 1. Do you encrypt traffic over open networks? Do you use HTTPS over the Internet?

From d5337dc9c990a018a4a8a493c7a17a021fcb61b5 Mon Sep 17 00:00:00 2001
From: raviranjanelastisys <75951259+raviranjanelastisys@users.noreply.github.com>
Date: Fri, 4 Jun 2021 13:16:30 +0200
Subject: [PATCH 07/18] Update README.md

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index c661d06..82c0c4a 100644
--- a/README.md
+++ b/README.md
@@ -142,6 +142,7 @@ What policy statements do you have in place to show your compliance with the abo
 
 1. Do we have Identity provider access control in place ? Is it both for the user and other service ?
 2. Do we have the service account for cloud run service ? If so , what permissions are provided ?
+3. How access tokens used to authenticate when calling Google Cloud APIs ?
 
 ### Use of Cryptography
 

From a3a010d79b3e22e0e4cfc88ec970fa7f562dd1b0 Mon Sep 17 00:00:00 2001
From: raviranjanelastisys <75951259+raviranjanelastisys@users.noreply.github.com>
Date: Fri, 4 Jun 2021 14:00:31 +0200
Subject: [PATCH 08/18] Update README.md

---
 README.md | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 82c0c4a..5a50e6b 100644
--- a/README.md
+++ b/README.md
@@ -141,8 +141,11 @@ What policy statements do you have in place to show your compliance with the abo
 ### Case-2- Cloud Run Authentication
 
 1. Do we have Identity provider access control in place ? Is it both for the user and other service ?
-2. Do we have the service account for cloud run service ? If so , what permissions are provided ?
-3. How access tokens used to authenticate when calling Google Cloud APIs ?
+3. Do we have the service account for cloud run service ? If so , what permissions are provided ?
+4. How access tokens used to authenticate when calling Google Cloud APIs ?
+5. How is the secrets manages in cloud run ? Is secret managed by secret manager ?
+6. Do we have customer managed encrytion keys ?
+7. Is cloud run integrated with Binarization autherization ? Or code is binarized and then deployed ?
 
 ### Use of Cryptography
 

From f028e17600b915552939d37445ef964b340b89de Mon Sep 17 00:00:00 2001
From: Jakub Krzywda
Date: Tue, 8 Jun 2021 10:34:15 +0200
Subject: [PATCH 09/18] Remove general parts

---
 README.md | 338 +++---------------------------------------------------
 1 file changed, 14 insertions(+), 324 deletions(-)

diff --git a/README.md b/README.md
index 5a50e6b..b28679e 100644
--- a/README.md
+++ b/README.md
@@ -1,328 +1,18 @@
-# Cloud Information Security Review Checklist
+# Google Run Security
 
-## Governance, risk management, and compliance
+## Cloud Run Container Security
 
-1. What regulations / information security standards do you need to comply with?
+1. What is the source of your base image? Is is a signed one?
Do we have a lean base image? +1. How are vulnerabilities found at Non-OS level (Python, npm, ruby gems, etc.)? +1. Deos your container follow CIS benchmark? +1. Are there any extra packages in container that can be security vulnerabilities? +1. Are your containers running as a non-root user? - 1. ☐ FFFS 2014:7 - 2. ☐ SOC-2 - 3. ☐ PCI-DSS - 4. ☐ ISO 27001 - 5. ☐ HIPAA - 6. ☐ Swedish HealthCare - 7. ☐ GDPR - 8. Other +## Cloud Run Authentication -2. Who/what role in your organisation is responsible for compliance? - -3. Does your organization have general information security policies outside of your compliance requirements in place? (e.g., “We need to store data in Brazil to appeal to national sentiments.”) - -### Mapping of Standards to Policies - -What policy statements do you have in place to show your compliance with the above standard(s)? - -### Mapping of Policies to Implementation - -1. How do you track implementation? - -2. What is your development/implementation methodology? - - 1. How do you take high level policy requirements and create implementable units of work? - -3. How do you track and plan implementation/development? - -### Evidence of Implementation fulfilling Policies - -1. When a policy is implemented how is this validated? - -2. Is this validation recorded? - -3. Can an auditor access this record? - -## High-Level General Practices - -### Supply Management - -1. Is cloud infrastructure in line with your compliance requirements? - -2. Is there a multi-cloud architecture ? if yes , what other cloud provided services used ? - -3. Are the underlying VMs / load-balancers / storage sufficiently protected? (e.g., via firewalls) - -4. Is your application connected to any managed services? (e.g., database-as-a-service, logging-as-a-service, incident-management-as-a-service) - -### Separation of testing and production - -1. Do you separate testing from production cloud clusters? - -2. Do you take production data into testing? - -3. 
What is your source of test data? - -4. Is your testing infrastructure sufficiently protected? - -### Access Control - -1. Describe your access control system? - -2. Is multi-factor authentication enabled? - -3. How many people have access to the production cluster? - - 1. Is role based access implemented? (Read-only, Read-Write) - -4. Do you have one user account per person? - -5. Do you keep an audit trail on access to the production cluster? - -6. Is the audit trail stored in a tamper-proof logging[^1] environment? - -7. Do you regularly review access, e.g., revoking access to people leaving your organization? - -### Logging - -1. Do you forward nodes (journald), platform and application logs to a tamper-proof logging environment? - -2. What is your retention policy? What is the minimum amount of time you keep logging entries? What is the maximum amount of time you keep logging entries? - - 1. Source of retention policy? Specified in legislation or company policy? If company policy, why that figure? - -### Backups - -1. Do you perform regular backups of your production clusters? - -2. Do you regularly test restoring from backups? - -### Change Management - -1. What is the journey of code changes? (e.g., How does a new feature make it into production?) - -2. What is the journey of infrastructure changes? (e.g., How do you create a new cluster? How do you update a SecurityGroup?) - -3. What is the journey of data changes? (e.g., How do you add a new column to your database?) - -4. Do you enforce a 2-person policy for performing system changes? - -5. Do you use [semantic versioning](https://semver.org/) for container image tags? - -6. How do you perform data migration? - -7. In the event of a ‘bad’ deployment, can you rollback to a previous good state? - -8. Is this rollback ability tested regularly? - -9. Do you do Canary deployments? - -### Technical Vulnerability Management - -1. Do you scan containers for vulnerabilities before entering production? - -2. 
Do you have a process in place to get alerted when a container/VM's becomes vulnerable in production? - -3. Do you enforce deployment only from known-good container image registries? - -4. After how much time do you fix known container image vulnerabilities? - -5. Is the production cluster updated, as required to avoid vulnerabilities? - -6. Is the underlying OS image updated, as required to avoid vulnerabilities? - -7. Are other adjacent services (e.g., container registry, logging environment, identity provider) updated, as required to avoid vulnerabilities? - - -### Case-1- Cloud Run Container Security - -1. What is the source of base image ? Is is signed one ? Do we have a lean base image -2. How vulnerabilities found at Non-OS level (Python, npm, ruby gems, etc) -3. Is your container follows CIS benchmark ? -4. Is there any extra packages in container that can be security vulnerabilities ? -5. Is your container running as a user ? - -### Case-2- Cloud Run Authentication - -1. Do we have Identity provider access control in place ? Is it both for the user and other service ? -3. Do we have the service account for cloud run service ? If so , what permissions are provided ? -4. How access tokens used to authenticate when calling Google Cloud APIs ? -5. How is the secrets manages in cloud run ? Is secret managed by secret manager ? -6. Do we have customer managed encrytion keys ? -7. Is cloud run integrated with Binarization autherization ? Or code is binarized and then deployed ? - -### Use of Cryptography - -1. Do you encrypt traffic over open networks? Do you use HTTPS over the Internet? - -2. Do you regularly rotate cryptographic keys? - -3. Do you use HSTS[^2]? - -### Network Segregation - -1. Do you use network segregation? - -### Intrusion Detection / Prevention - -1. Do you have an intrusion detection system in the production cluster? - -2. Do you have a web application firewall in front or within the production cluster? - -3. 
Do you alert on blocked outbound traffic? - -### Business Continuity - -1. Are your systems designed with sufficient redundancy? (e.g., multiple availability zones, multiple servers, multiple application replicas) - -2. Do you regularly test your redundancy? (e.g., by failing a server and killing an application replica) - -3. Do at least two team members have access to each system? - -### Incident Management - -1. Are your systems sufficiently documented for incident management? - -2. Do you have a clear definition of what is an incident? - -3. Does your team have clear procedures for who handles incidents and how to handle incidents? - -4. Are there situations where you need to fix data in a production environment? - - 1. If yes, what is your process? - -#### Incident reporting: Internal - -1. How do internal users report incidents? - -2. Is there an escalation process in place if an incident is not responded to in time? - -#### Incident reporting: External - -1. How do external users report incidents? - -2. Is there an escalation process in place if an incident is not responded to in time? - -## Cluster Security - -1. Do you use Role-Based Access Control? - -2. Do you use OpenPolicyAgent? - -## Development Practices - -1. How is a container image built? - -2. How is a container image deployed into the production cluster? - -3. How do you manage deployments? - -4. Do you practice GitOps[^3]? - -5. Do you fully separate development from production environments? - -6. Is there a fully separate environment for testing/QA? - -### Development Environment - -#### Access control - -1. How is access managed to this environment? (Role based, per user etc) - -2. Describe your source control systems and process - -### Testing/QA Environment - -#### Access control - -1. How is access managed to this Test/QA environment? (Role based, per user etc) - -#### Test data - -1. How is test data created/sourced? - -2. Does data come from production data? - - 1. 
If yes, is it anonymised? - -### Production Environment - -#### Access control - -1. How is access managed to this production environment? (Role based, per user etc) - -### Code/build Deployment - -##### Application - -1. Describe your application deployment process - -2. What checks do you carry out on your application at deployment time? - -##### Volume mounts/Data sources - -1. Describe your deployment/update process for data sources. - -2. Is your data persistence layer built using the Infrastructure as code paradigm? - -3. Is your data persistence layer version controlled? - -#### Permissions - -##### Automated/manual system - -## Operational Practices - -### Health Checks - -1. Do you maintain KPIs for determining if your production cluster is working well? (e.g., USE: utilization, saturation, errors) - -2. Do you have relevant alerts in place? - -3. Do you have alerts for “slowly filling and getting full”? (e.g., disk space) - -4. Do you require Liveness, Readiness and Startup Probes? - -### Operational Logging and Alerts - -1. Do you have relevant alerts in place? - -2. Do you have a defined process for handling alerts? - -3. What is the retention period for these logs? - -#### SLA’s/SLO’s management - -1. Do you maintain KPIs for determining if your users are served well? (e.g., daily active users, log-ins per day, registrations per day) - -2. Do you maintain KPIs for determining if your application is working well? (e.g., RED: request rate, error rate, duration) - -#### Path to resolution - -Describe the chain of events from an alert/issue raised to resolution - -### Security Logging and Alerts - -1. Do you have relevant alerts in place? - -2. Do you have a defined process for handling alerts? - -3. What is the retention period for these logs? - -#### Triage - -1. Do you have processes and systems in place for evaluating security incidents? (Info, Low, Medium, High, Critical) - -2. 
Do you have processes and systems in place for informing/alerting external shareholders of relevant incidents? (Data leaks, System availability etc) - -#### Path to resolution - -Describe the chain of events from an alert/issue raised to resolution - - -## Notes - -[^1]: - By “tamper-proof” it is understood that a person gaining (authorized or unauthorized access) to the production cluster cannot modify or delete existing log entries. - -[^2]: - https://en.wikipedia.org/wiki/HTTP_Strict_Transport_Security - -[^3]: - By “GitOps” we mean that all system changes (except in “break glass” scenarios) need to be performed via git commits. +1. Do you have Identity provider access control in place? Is it both for the user and other service? +1. Do you have the service account for cloud run service? If so, what permissions are provided? +1. How access tokens are used to authenticate when calling Google Cloud APIs? +1. How are the secrets manage in Cloud Run? Are secrets managed by the Secret Manager? +1. Do you have customer managed encrytion keys? +1. Is Cloud Run integrated with Binarization autherization? Or is code binarized and then deployed? 
From 4aed9b39a85f25d8b66fea0221beb8c938b4cca9 Mon Sep 17 00:00:00 2001 From: Jakub Krzywda Date: Tue, 8 Jun 2021 10:35:37 +0200 Subject: [PATCH 10/18] Rename file --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index b28679e..3b30b96 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ -# Google Run Security +# Google Cloud Run Security ## Cloud Run Container Security From 96e85870c32737087b7e3581f7665cae5e6fea3b Mon Sep 17 00:00:00 2001 From: Jakub Krzywda Date: Tue, 8 Jun 2021 10:36:40 +0200 Subject: [PATCH 11/18] Rename file --- google-cloud-run-security.md | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) create mode 100644 google-cloud-run-security.md diff --git a/google-cloud-run-security.md b/google-cloud-run-security.md new file mode 100644 index 0000000..3b30b96 --- /dev/null +++ b/google-cloud-run-security.md @@ -0,0 +1,18 @@ +# Google Cloud Run Security + +## Cloud Run Container Security + +1. What is the source of your base image? Is is a signed one? Do we have a lean base image? +1. How are vulnerabilities found at Non-OS level (Python, npm, ruby gems, etc.)? +1. Deos your container follow CIS benchmark? +1. Are there any extra packages in container that can be security vulnerabilities? +1. Are your containers running as a non-root user? + +## Cloud Run Authentication + +1. Do you have Identity provider access control in place? Is it both for the user and other service? +1. Do you have the service account for cloud run service? If so, what permissions are provided? +1. How access tokens are used to authenticate when calling Google Cloud APIs? +1. How are the secrets manage in Cloud Run? Are secrets managed by the Secret Manager? +1. Do you have customer managed encrytion keys? +1. Is Cloud Run integrated with Binarization autherization? Or is code binarized and then deployed? 
From 243ac4d14985c08e04279147bb8e6fd5e397e00e Mon Sep 17 00:00:00 2001 From: Jakub Krzywda Date: Tue, 8 Jun 2021 11:32:32 +0200 Subject: [PATCH 12/18] Fix spelling --- google-cloud-run-security.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/google-cloud-run-security.md b/google-cloud-run-security.md index 3b30b96..c2c6184 100644 --- a/google-cloud-run-security.md +++ b/google-cloud-run-security.md @@ -2,10 +2,10 @@ ## Cloud Run Container Security -1. What is the source of your base image? Is is a signed one? Do we have a lean base image? +1. What is the source of your base image? Is is a signed one? Do you have a lean base image? 1. How are vulnerabilities found at Non-OS level (Python, npm, ruby gems, etc.)? -1. Deos your container follow CIS benchmark? -1. Are there any extra packages in container that can be security vulnerabilities? +1. Does your container follow CIS benchmark? +1. Are there any extra packages in containers that can be security vulnerabilities? 1. Are your containers running as a non-root user? ## Cloud Run Authentication From 3504224ebfc9c053d71387be4b5e46ad442d84ee Mon Sep 17 00:00:00 2001 From: Jakub Krzywda Date: Tue, 8 Jun 2021 11:33:23 +0200 Subject: [PATCH 13/18] Revert K8s checklist to original state --- README.md | 550 ++++++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 536 insertions(+), 14 deletions(-) diff --git a/README.md b/README.md index 3b30b96..870a12a 100644 --- a/README.md +++ b/README.md @@ -1,18 +1,540 @@ -# Google Cloud Run Security +## Kubernetes Information Security Review Checklist -## Cloud Run Container Security -1. What is the source of your base image? Is is a signed one? Do we have a lean base image? -1. How are vulnerabilities found at Non-OS level (Python, npm, ruby gems, etc.)? -1. Deos your container follow CIS benchmark? -1. Are there any extra packages in container that can be security vulnerabilities? -1. Are your containers running as a non-root user? 
+## Governance, risk management, and compliance -## Cloud Run Authentication +1. What regulations / information security standards do you need to comply with? -1. Do you have Identity provider access control in place? Is it both for the user and other service? -1. Do you have the service account for cloud run service? If so, what permissions are provided? -1. How access tokens are used to authenticate when calling Google Cloud APIs? -1. How are the secrets manage in Cloud Run? Are secrets managed by the Secret Manager? -1. Do you have customer managed encrytion keys? -1. Is Cloud Run integrated with Binarization autherization? Or is code binarized and then deployed? + + 1. ☐ FFFS 2014:7 + 2. ☐ SOC-2 + 3. ☐ PCI-DSS + 4. ☐ ISO 27001 + 5. ☐ HIPAA + 6. ☐ Swedish HealthCare + 7. ☐ GDPR + 8. Other + + +2. Who/what role in your organisation is responsible for compliance? + + +3. Does your organization have general information security policies outside of your compliance requirements in place? (e.g., “We need to store data in Brazil to appeal to national sentiments.”) + + +### Mapping of Standards to Policies + +What policy statements do you have in place to show your compliance with the above standard(s)? + + +### Mapping of Policies to Implementation + + + +1. How do you track implementation? + + +2. What is your development/implementation methodology? + + + 1. How do you take high level policy requirements and create implementable units of work? + + + +3. How do you track and plan implementation/development? + + +### Evidence of Implementation fulfilling Policies + + +1. When a policy is implemented how is this validated? + + +2. Is this validation recorded? + + +3. Can an auditor access this record? + + +## High-Level General Practices + + +### Supply Management + + + +1. Who supplies your Kubernetes cluster(s) infrastructure today? + + +2. Is the underlying cloud providers infrastructure in line with your compliance requirements? + + +3. 
Are the underlying VMs / load-balancers / storage sufficiently protected? (e.g., via firewalls) + + +4. Is your Kubernetes cluster connected to any managed services? (e.g., database-as-a-service, logging-as-a-service, incident-management-as-a-service) + + +### Separation of testing and production + + + +1. Do you separate testing from production clusters? + + +2. Do you take production data into testing? + + +3. What is your source of test data? + + +4. Is your testing infrastructure sufficiently protected? + + +### Access Control + + + +1. Describe your access control system for Kubernetes? + + +2. Is multi-factor authentication enabled? + + +3. How many people have access to the production Kubernetes cluster? + + + 1. Is role based access implemented? (Read-only, Read-Write) + + + +4. Do you have one user account per person? + + +5. Do you keep an audit trail on Kubernetes API access? + + +6. Is the audit trail stored in a tamper-proof logging[^1] environment? + + +7. Do you regularly review access, e.g., revoking access to people leaving your organization? + + +### Logging + + + +1. Do you forward nodes (journald), Kubernetes (API server) and application logs to a tamper-proof logging environment? + + +2. What is your retention policy? What is the minimum amount of time you keep logging entries? What is the maximum amount of time you keep logging entries? + + + 1. Source of retention policy? Specified in legislation or company policy? If company policy, why that figure? + + + +### Backups + + + +1. Do you perform regular backups of your production Kubernetes clusters? + + +2. Do you regularly test restoring from backups? + + +### Change Management + + + +1. What is the journey of code changes? (e.g., How does a new feature make it into production?) + + +2. What is the journey of infrastructure changes? (e.g., How do you create a new Kubernetes cluster? How do you update a SecurityGroup?) + + +3. What is the journey of data changes? 
(e.g., How do you add a new column to your database?) + + +4. Do you enforce a 2-person policy for performing system changes? + + +5. Do you use [semantic versioning](https://semver.org/) for container image tags? + + +6. How do you perform data migration? + + +7. In the event of a ‘bad’ deployment, can you rollback to a previous good state? + + +8. Is this rollback ability tested regularly? + + +9. Do you do Canary deployments? + + +### Technical Vulnerability Management + + + +1. Do you scan containers for vulnerabilities before entering production? + + +2. Do you have a process in place to get alerted when a container becomes vulnerable in production? + + +3. Do you enforce deployment only from known-good container image registries? + + +4. After how much time do you fix known container image vulnerabilities? + + +5. Is the production Kubernetes cluster updated, as required to avoid vulnerabilities? + + +6. Is the underlying OS image updated, as required to avoid vulnerabilities? + + +7. Are other adjacent services (e.g., container registry, logging environment, identity provider) updated, as required to avoid vulnerabilities? + + +### Use of Cryptography + + + +1. Do you encrypt traffic over open networks? Do you use HTTPS over the Internet? + + +2. Do you regularly rotate cryptographic keys? + + +3. Do you use HSTS[^2]? + + +### Network Segregation + + + +1. Do you have NetworkPolicies in place? + + +### Intrusion Detection / Prevention + + + +1. Do you have an intrusion detection system in the Kubernetes cluster? + + +2. Do you have a web application firewall in front or within the Kubernetes cluster? + + +3. Do you alert on blocked outbound traffic? + + +### Business Continuity + + + +1. Are your systems designed with sufficient redundancy? (e.g., multiple availability zones, multiple Kubernetes nodes, multiple Pod replicas) + + +2. Do you regularly test your redundancy? (e.g., by failing a Kubernetes node and killing a Pod replica) + + +3. 
Do at least two team members have access to each system? + + +### Incident Management + + + +1. Are your systems sufficiently documented for incident management? + + +2. Do you have a clear definition of what is an incident? + + +3. Does your team have clear procedures for who handles incidents and how to handle incidents? + + +4. Are there situations where you need to fix data in a production environment? + + + 1. If yes, what is your process? + + + +#### Incident reporting: Internal + + + +1. How do internal users report incidents? + + +2. Is there an escalation process in place if an incident is not responded to in time? + + +#### Incident reporting: External + + + +1. How do external users report incidents? + + +2. Is there an escalation process in place if an incident is not responded to in time? + + +## Kubernetes Cluster Security + + + +1. Does your cluster pass the [CIS Kubernetes security benchmark](https://github.com/aquasecurity/kube-bench)? + + +2. Do you have RBAC enabled? + + +3. Do you use PodSecurityPolicy? + + +4. Do you use OpenPolicyAgent? + + +## Kubernetes Workload Security + + + +1. Do you enforce appropriate Pod SecurityContext? + + + 1. Do you enforce no privileged Pods? + 2. Do you enforce minimum Pod capabilities? + 3. Do you enforce no Pods running as root? + 4. Do you enforce no Pod sysctls? + 5. Do you enforce no hostPath-s? + 6. Do you enforce no nodePort / host network? + + + +2. Do you enforce no [automountServiceAccountToken](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/)? + + +3. Do you enforce Pod resource limits? + + +4. Do you enforce minimum Pod replication? + + +5. Do you enforce a read-only Pod file system? + + +6. Are all Pods properly labeled? + + +7. Are Pod container images pinned to a specific version (i.e., no “latest”)? + + +8. Do you use Secrets appropriately? + + +9. Do you restrict access to Secrets? + + +10. Do you enforce Ingress TLS? + + +11. 
Do you have sufficient and tested NetworkPolicies in place? + + +12. Did you run through other [Kubernetes Best Practices](https://learnk8s.io/production-best-practices)? + + +## Kubernetes Development Practices + + + +1. How is a container image built? + + +2. How is a container image deployed into the Kubernetes cluster? + + +3. How do you manage Kubernetes resources? (e.g., Helm) + + +4. Do you practice GitOps[^3]? + + +5. Do you fully separate development from production environments? + + +6. Is there a fully separate environment for testing/QA? + + +### Development Environment + + +#### Access control + + + +1. How is access managed to this environment? (Role based, per user etc) + + +2. Describe your source control systems and process + + +### Testing/QA Environment + + +#### Access control + + + +1. How is access managed to this environment? (Role based, per user etc) + + +#### Test data + + + +1. How is test data created/sourced? + + +2. Does data come from production data? + + + 1. If yes, is it anonymised? + + + +### Production Environment + + +#### Access control + + + +1. How is access managed to this environment? (Role based, per user etc) + + +### Code/build Deployment + + +##### Kubernetes + + + +1. Describe your pod deployment process + + +2. What checks do you carry out on pods at deployment time? + + +##### Volume mounts/Data sources + + + +1. Describe your deployment/update process for data sources. + + +2. Is your data persistence layer built using the Infrastructure as code paradigm? + + +3. Is your data persistence layer version controlled? + + +#### Permissions + + +##### Automated/manual system + + +## Kubernetes Operational Practices + + +### Health Checks + + + +1. Do you maintain KPIs for determining if your Kubernetes cluster is working well? (e.g., USE: utilization, saturation, errors) + + +2. Do you have relevant alerts in place? + + +3. Do you have alerts for “slowly filling and getting full”? (e.g., disk space) + + +4. 
Do you require [Liveness, Readiness and Startup Probes](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/)? + + +### Operational Logging and Alerts + + + +1. Do you have relevant alerts in place? + + +2. Do you have a defined process for handling alerts? + + +3. What is the retention period for these logs? + + +#### SLA’s/SLO’s management + + + +1. Do you maintain KPIs for determining if your users are served well? (e.g., daily active users, log-ins per day, registrations per day) + + +2. Do you maintain KPIs for determining if your application is working well? (e.g., RED: request rate, error rate, duration) + + +#### Path to resolution + +Describe the chain of events from an alert/issue raised to resolution + + +### Security Logging and Alerts + + + +1. Do you have relevant alerts in place? + + +2. Do you have a defined process for handling alerts? + + +3. What is the retention period for these logs? + + +#### Triage + + + +1. Do you have processes and systems in place for evaluating security incidents? (Info, Low, Medium, High, Critical) + + +2. Do you have processes and systems in place for informing/alerting external shareholders of relevant incidents? (Data leaks, System availability etc) + + +#### Path to resolution + +Describe the chain of events from an alert/issue raised to resolution + + + +## Notes + +[^1]: + By “tamper-proof” it is understood that a person gaining (authorized or unauthorized access) to the production Kubernetes cluster cannot modify or delete existing log entries. + +[^2]: + https://en.wikipedia.org/wiki/HTTP_Strict_Transport_Security + +[^3]: + By “GitOps” we mean that all system changes (except in “break glass” scenarios) need to be performed via git commits. 
\ No newline at end of file From 3dcea1592f9eb0cfb444bb440a8b19ae629b56e9 Mon Sep 17 00:00:00 2001 From: Jakub Krzywda Date: Tue, 15 Jun 2021 15:04:22 +0200 Subject: [PATCH 14/18] Split image questions --- google-cloud-run-security.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/google-cloud-run-security.md b/google-cloud-run-security.md index c2c6184..03fb2d5 100644 --- a/google-cloud-run-security.md +++ b/google-cloud-run-security.md @@ -2,7 +2,8 @@ ## Cloud Run Container Security -1. What is the source of your base image? Is is a signed one? Do you have a lean base image? +1. What is the source of your base image? +1. Is the Docker image signed one? Do you use Docker content trust feature? 1. How are vulnerabilities found at Non-OS level (Python, npm, ruby gems, etc.)? 1. Does your container follow CIS benchmark? 1. Are there any extra packages in containers that can be security vulnerabilities? From 7d6b7009a83f60d46fc30c66ac868714c4f50c4e Mon Sep 17 00:00:00 2001 From: Jakub Krzywda Date: Tue, 15 Jun 2021 15:09:39 +0200 Subject: [PATCH 15/18] Docker Content Trust --- google-cloud-run-security.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/google-cloud-run-security.md b/google-cloud-run-security.md index 03fb2d5..9057cd8 100644 --- a/google-cloud-run-security.md +++ b/google-cloud-run-security.md @@ -3,7 +3,7 @@ ## Cloud Run Container Security 1. What is the source of your base image? -1. Is the Docker image signed one? Do you use Docker content trust feature? +1. Is the Docker image signed one? Do you use Docker Content Trust (DCT) feature? 1. How are vulnerabilities found at Non-OS level (Python, npm, ruby gems, etc.)? 1. Does your container follow CIS benchmark? 1. Are there any extra packages in containers that can be security vulnerabilities? 
From 4ebb5a401e9bc65972398f6a2130605308a529c6 Mon Sep 17 00:00:00 2001 From: Jakub Krzywda Date: Tue, 15 Jun 2021 15:12:11 +0200 Subject: [PATCH 16/18] Add link to docker bench security --- google-cloud-run-security.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/google-cloud-run-security.md b/google-cloud-run-security.md index 9057cd8..e9565dd 100644 --- a/google-cloud-run-security.md +++ b/google-cloud-run-security.md @@ -5,7 +5,7 @@ 1. What is the source of your base image? 1. Is the Docker image signed one? Do you use Docker Content Trust (DCT) feature? 1. How are vulnerabilities found at Non-OS level (Python, npm, ruby gems, etc.)? -1. Does your container follow CIS benchmark? +1. Does your container follow [CIS benchmark](https://github.com/docker/docker-bench-security)? 1. Are there any extra packages in containers that can be security vulnerabilities? 1. Are your containers running as a non-root user? From 684bb2fc4b4ec3a05360203eea11966a75bf20ae Mon Sep 17 00:00:00 2001 From: raviranjanelastisys <75951259+raviranjanelastisys@users.noreply.github.com> Date: Tue, 15 Jun 2021 18:53:51 +0530 Subject: [PATCH 17/18] Update google-cloud-run-security.md --- google-cloud-run-security.md | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/google-cloud-run-security.md b/google-cloud-run-security.md index e9565dd..c0fe0bf 100644 --- a/google-cloud-run-security.md +++ b/google-cloud-run-security.md @@ -2,12 +2,16 @@ ## Cloud Run Container Security -1. What is the source of your base image? -1. Is the Docker image signed one? Do you use Docker Content Trust (DCT) feature? -1. How are vulnerabilities found at Non-OS level (Python, npm, ruby gems, etc.)? -1. Does your container follow [CIS benchmark](https://github.com/docker/docker-bench-security)? -1. Are there any extra packages in containers that can be security vulnerabilities? -1. Are your containers running as a non-root user? 
+1.Container Base Image + - What is the source of your base image? + - Is the Docker image signed? Do you use the Docker Content Trust feature? + - Do you have a lean base image? +2. How are vulnerabilities found at Non-OS level (Python, npm, ruby gems, etc.)? +3. Does your container follow [CIS benchmark](https://github.com/docker/docker-bench-security)? +4. Are there any extra packages in containers that can be security vulnerabilities? +5. Are containers running as a non-root user? +6. How does the image avoid privilege escalation, and how does it handle Linux capabilities? + ## Cloud Run Authentication From 18d5af7294e4e06047ef00f4a4fe2a2346d59b13 Mon Sep 17 00:00:00 2001 From: raviranjanelastisys <75951259+raviranjanelastisys@users.noreply.github.com> Date: Tue, 15 Jun 2021 18:54:28 +0530 Subject: [PATCH 18/18] Update google-cloud-run-security.md --- google-cloud-run-security.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/google-cloud-run-security.md b/google-cloud-run-security.md index c0fe0bf..d615a84 100644 --- a/google-cloud-run-security.md +++ b/google-cloud-run-security.md @@ -2,7 +2,7 @@ ## Cloud Run Container Security -1. Container Base Image +1. Container Base Image - What is the source of your base image? - Is the Docker image signed? Do you use the Docker Content Trust feature? - Do you have a lean base image?
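
Reviewer note on the non-root and Linux capabilities questions added in PATCH 17/18: one possible way to gather evidence for them is a small self-check run inside the container. The sketch below is only an illustration and not part of the checklist. The helper names (`parse_cap_eff`, `risky_capabilities`, `audit`) and the short list of capabilities treated as risky are assumptions made for this example. The bit numbers follow `linux/capability.h`, and `CapEff` is the effective capability mask that Linux reports in `/proc/<pid>/status`.

```python
# Hypothetical audit helpers; names and the "risky" subset are illustrative only.

# A small subset of Linux capability bit numbers (from linux/capability.h).
CAP_NAMES = {
    0: "CAP_CHOWN",
    7: "CAP_SETUID",
    12: "CAP_NET_ADMIN",
    21: "CAP_SYS_ADMIN",
}


def parse_cap_eff(status_text):
    """Return the effective capability mask found in /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            # The value is a hexadecimal bitmask, e.g. "0000000000200080".
            return int(line.split()[1], 16)
    raise ValueError("no CapEff line found")


def risky_capabilities(mask):
    """List the capabilities from CAP_NAMES whose bit is set in the mask."""
    return [name for bit, name in sorted(CAP_NAMES.items()) if mask & (1 << bit)]


def audit(uid, status_text):
    """Summarise whether the process runs as root and which listed capabilities it holds."""
    mask = parse_cap_eff(status_text)
    return {"runs_as_root": uid == 0, "capabilities": risky_capabilities(mask)}
```

Inside a running container, the inputs would typically come from `os.geteuid()` and the contents of `/proc/self/status`. A result with `runs_as_root` set to `False` and an empty capability list is the outcome the checklist questions are looking for.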