

Automated Monitoring:

1.    There are two types of status checks for monitoring the health of your instance: system status checks and instance status checks.

2.    System Status Checks - These checks detect problems with your instance that require AWS involvement to repair. When a system status check fails, you can either wait for AWS to fix the issue or resolve it yourself (for example, by stopping and starting the instance, or by terminating and replacing it). Examples of problems that cause system status checks to fail include:

a.    Loss of network connectivity

b.    Loss of system power

c.    Software issues on the physical host

d.    Hardware issues on the physical host

3.    Instance Status Checks - These checks monitor the software and network configuration of your individual instance and detect problems that require your involvement to repair. When an instance status check fails, you typically need to address the problem yourself (for example, by rebooting the instance or by modifying your operating system configuration). Examples of problems that may cause instance status checks to fail include:

a.    Failed system status checks

b.    Misconfigured networking or startup configuration

c.    Exhausted memory

d.    Corrupted file system

e.    Incompatible kernel

4.    Amazon EC2 Monitoring Scripts (Best for custom monitoring) - Perl scripts that can monitor memory, disk, and swap file usage in your instances. 

5.    Amazon CloudWatch Logs - monitor, store, and access your log files from Amazon EC2 instances, AWS CloudTrail, or other sources.

 

Manual monitoring:

6.    Using Amazon EC2 Dashboard you can check:

a.    Service Health and Scheduled Events by region

b.    Instance state

c.    Status checks

d.    Alarm status

e.    Instance metric details (In the navigation pane click Instances, select an instance, and then click the Monitoring tab)

f.     Volume metric details (In the navigation pane click Volumes, select a volume, and then click the Monitoring tab)

7.    CloudWatch is an AWS service that automatically collects a wide range of performance and health data about your AWS resources. This data is available through an API and can also be viewed as graphs in the AWS console. However, the graphs are located on separate console pages for each type of resource (for example, EC2, RDS, or load balancers).

8.    Amazon CloudWatch Dashboard shows:

a.    Current alarms and status

b.    Graphs of alarms and resources

9.    Service health status

a.    You can monitor the status of your instances by viewing status checks and scheduled events for your instances.

b.    You can also see status on specific events scheduled for your instances. Events provide information about upcoming activities such as rebooting or retirement that are planned for your instances, along with the scheduled start and end time of each event.

10. A status check gives you the information that results from automated checks performed by Amazon EC2. 

a.    With instance status monitoring, you can quickly determine whether Amazon EC2 has detected any problems that might prevent your instances from running applications.

b.    Amazon EC2 performs automated checks on every running EC2 instance to identify hardware and software issues.

c.    Amazon CloudWatch monitoring for EC2 covers CPU utilization, network traffic, and disk activity by default. Memory utilization is not included.

d.    Status checks are performed every minute and each returns a pass or a fail status. If all checks pass, the overall status of the instance is OK. If one or more checks fail, the overall status is impaired.

e.    Status checks are built into Amazon EC2, so they cannot be disabled or deleted. 

f.     You can, however, create or delete alarms that are triggered based on the result of the status checks. For example, you can create an alarm to warn you if status checks fail on a specific instance.
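The pass/fail rule described above can be sketched in a few lines of Python. This is only an illustration of the logic, not how Amazon EC2 implements it; the function names and the dictionary layout are assumptions made for this example.

```python
from typing import Dict, List


def overall_status(checks: Dict[str, bool]) -> str:
    """Return 'ok' if every check passed, otherwise 'impaired'.

    `checks` maps a check name (hypothetical keys such as 'system'
    and 'instance') to True for a pass or False for a fail.
    """
    return "ok" if all(checks.values()) else "impaired"


def failed_checks(checks: Dict[str, bool]) -> List[str]:
    """List the names of the checks that failed, in sorted order."""
    return sorted(name for name, passed in checks.items() if not passed)


if __name__ == "__main__":
    sample = {"system": True, "instance": False}
    print(overall_status(sample), failed_checks(sample))
```

A single failed check is enough to mark the instance as impaired, which is why an alarm on status check failures is a useful early warning.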

10. Amazon EC2 supports the following types of scheduled events for your instances:

a.    Instance stop: The instance will be stopped. When you start it again, it's migrated to a new host computer. Applies only to instances backed by Amazon EBS.

b.    Instance retirement: The instance will be stopped or terminated.

c.    Reboot: Either the instance will be rebooted (instance reboot) or the host computer for the instance will be rebooted (system reboot).

d.    System maintenance: The instance might be temporarily affected by network maintenance or power maintenance.

11. When AWS detects irreparable failure of the underlying host computer for your instance, it schedules the instance to stop or terminate, depending on the type of root device for the instance. If the root device is an EBS volume, the instance is scheduled to stop. If the root device is an instance store volume, the instance is scheduled to terminate.

a.    Actions for Instances Backed by Amazon EBS: You can wait for the maintenance to occur as scheduled. Alternatively, you can stop and start the instance, which migrates it to a new host computer.

b.    Actions for Instances Backed by Instance Store: You can wait for the maintenance to occur as scheduled. Alternatively, if you want to maintain normal operation during a scheduled maintenance window, you can launch a replacement instance from your most recent AMI, migrate all necessary data to the replacement instance before the scheduled maintenance window, and then terminate the original instance.
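The rule in item 11 (stop for EBS-backed instances, terminate for instance store-backed instances) can be written as a small lookup. This is a sketch for study purposes; the function name is hypothetical, and the strings "ebs" and "instance-store" are assumed here to be the root device type values you would see for an instance.

```python
def scheduled_action_for_host_failure(root_device_type: str) -> str:
    """Return the action scheduled when the underlying host fails irreparably."""
    actions = {
        "ebs": "stop",                  # EBS root volume: instance is stopped
        "instance-store": "terminate",  # instance store root: instance is terminated
    }
    try:
        return actions[root_device_type]
    except KeyError:
        raise ValueError(f"unknown root device type: {root_device_type!r}")


if __name__ == "__main__":
    for device_type in ("ebs", "instance-store"):
        print(device_type, "->", scheduled_action_for_host_failure(device_type))
```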

12. By default, Amazon EC2 sends metric data to CloudWatch in 5-minute periods. To send metric data for your instance to CloudWatch in 1-minute periods, you can enable detailed monitoring on the instance.

a.    Basic: Data is available automatically in 5-minute periods at no charge.

b.    Detailed: Data is available in 1-minute periods for an additional cost. To get this level of data, you must specifically enable it for the instance. For the instances where you've enabled detailed monitoring, you can also get aggregated data across groups of similar instances.
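To make the difference between the two monitoring levels concrete, the short sketch below counts how many data points each level produces per hour. The constant and function names are made up for this example.

```python
BASIC_PERIOD_SECONDS = 300     # basic monitoring: 5-minute periods
DETAILED_PERIOD_SECONDS = 60   # detailed monitoring: 1-minute periods


def datapoints_per_hour(period_seconds: int) -> int:
    """Number of data points produced in one hour for a given period."""
    if period_seconds <= 0 or 3600 % period_seconds != 0:
        raise ValueError("period must be a positive divisor of 3600 seconds")
    return 3600 // period_seconds


if __name__ == "__main__":
    print("basic:", datapoints_per_hour(BASIC_PERIOD_SECONDS))
    print("detailed:", datapoints_per_hour(DETAILED_PERIOD_SECONDS))
```

Detailed monitoring therefore yields five times as many data points per instance as basic monitoring.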

 

13. Aggregate statistics are available for the instances that have detailed monitoring enabled. Instances that use basic monitoring are not included in the aggregates.

a.    In addition, Amazon CloudWatch does not aggregate data across regions. Therefore, metrics are completely separate between regions.

b.    If you request statistics without specifying a dimension, CloudWatch returns statistics for all dimensions in the AWS/EC2 namespace.

c.    This technique for retrieving all dimensions across an AWS namespace does not work for custom namespaces that you publish to Amazon CloudWatch. With custom namespaces, you must specify the complete set of dimensions that are associated with any given data point to retrieve statistics that include the data point.

d.    You can aggregate statistics for the EC2 instances in an Auto Scaling group. Note that Amazon CloudWatch cannot aggregate data across regions. Metrics are completely separate between regions.

e.    After you launch an instance, you can open the Amazon EC2 console and view the monitoring graphs for an instance on the Monitoring tab. Each graph is based on one of the available Amazon EC2 metrics.
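The aggregation options in item 13 can be illustrated by building the request parameters for a CloudWatch statistics query. The sketch below only constructs a dictionary and does not call AWS. The parameter names follow the CloudWatch GetMetricStatistics API as far as we know, while the function name, the one-hour window, and the group name are assumptions for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Optional


def cpu_stats_request(end: datetime,
                      hours: int = 1,
                      auto_scaling_group: Optional[str] = None) -> Dict[str, Any]:
    """Build parameters for an average CPUUtilization query.

    With no dimension, the request covers all dimensions in AWS/EC2
    (only instances with detailed monitoring are aggregated).
    With `auto_scaling_group`, it is limited to that group.
    """
    params: Dict[str, Any] = {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 60,
        "Statistics": ["Average"],
    }
    if auto_scaling_group is not None:
        params["Dimensions"] = [
            {"Name": "AutoScalingGroupName", "Value": auto_scaling_group}
        ]
    return params


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(cpu_stats_request(now))
    print(cpu_stats_request(now, auto_scaling_group="my-asg"))  # hypothetical group name
```

If boto3 is installed and credentials are configured, a dictionary like this could be passed as keyword arguments to the CloudWatch client's get_metric_statistics call. Because metrics are separate between regions, a query of this kind would have to be repeated per region.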

14. The following graphs are available:

·         Average CPU Utilization (Percent)

·         Average Disk Reads (Bytes)

·         Average Disk Writes (Bytes)

·         Maximum Network In (Bytes)

·         Maximum Network Out (Bytes)

·         Summary Disk Read Operations (Count)

·         Summary Disk Write Operations (Count)

·         Summary Status (Any)

·         Summary Status Instance (Count)

·         Summary Status System (Count)

15. You can create a CloudWatch alarm that monitors CloudWatch metrics for one of your instances. CloudWatch will automatically send you a notification when the metric reaches a threshold you specify. You can create a CloudWatch alarm using the Amazon EC2 console, or using the more advanced options provided by the CloudWatch console.

16. You can use the stop or terminate actions to help you save money when you no longer need an instance to be running. You can use the reboot and recover actions to automatically reboot those instances or recover them onto new hardware if a system impairment occurs.

17. You can add the stop, terminate, reboot, or recover actions to any alarm that is set on an Amazon EC2 per-instance metric, including basic and detailed monitoring metrics provided by Amazon CloudWatch (in the AWS/EC2 namespace), as well as any custom metrics that include the “InstanceId=” dimension, as long as the InstanceId value refers to a valid running Amazon EC2 instance.

18. If you want to use an IAM role to stop, terminate, or reboot an instance using an alarm action, you can only use the EC2ActionsAccess role. Other IAM roles are not supported. If you are using another IAM role, you cannot stop, terminate, or reboot the instance. However, you can still see the alarm state and perform any other actions such as Amazon SNS notifications or Auto Scaling policies.

19. If you are using temporary security credentials granted using the AWS Security Token Service (AWS STS), you cannot recover an Amazon EC2 instance using alarm actions.

20. You can create an alarm that stops an Amazon EC2 instance when a certain threshold has been met. For example, you may run development or test instances and occasionally forget to shut them off. You can create an alarm that is triggered when the average CPU utilization percentage has been lower than 10 percent for 24 hours, signaling that it is idle and no longer in use. You can adjust the threshold, duration, and period to suit your needs, plus you can add an Amazon Simple Notification Service (Amazon SNS) notification, so that you will receive an email when the alarm is triggered.
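The idle-instance example in item 20 could be expressed as alarm parameters roughly as follows. This sketch only builds a dictionary and does not contact AWS. The keys follow the CloudWatch PutMetricAlarm API as far as we know; the function name, the alarm name, and the choice of 24 one-hour periods to represent "24 hours" are assumptions for the example.

```python
from typing import Any, Dict, Optional


def idle_stop_alarm(instance_id: str,
                    region: str,
                    sns_topic_arn: Optional[str] = None) -> Dict[str, Any]:
    """Alarm parameters: stop the instance when average CPU < 10% for 24 hours."""
    # EC2 stop action for alarms, plus an optional SNS topic for email notification.
    actions = [f"arn:aws:automate:{region}:ec2:stop"]
    if sns_topic_arn:
        actions.append(sns_topic_arn)
    return {
        "AlarmName": f"stop-idle-{instance_id}",   # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 3600,            # one-hour periods ...
        "EvaluationPeriods": 24,   # ... evaluated 24 times = 24 hours
        "Threshold": 10.0,
        "ComparisonOperator": "LessThanThreshold",
        "AlarmActions": actions,
    }


if __name__ == "__main__":
    # The instance ID and region below are placeholders.
    print(idle_stop_alarm("i-0123456789abcdef0", "us-east-1"))
```

Swapping the action suffix from stop to terminate would give the terminate variant described in item 21, subject to termination protection being disabled.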

21. You can create an alarm that terminates an EC2 instance automatically when a certain threshold has been met (as long as termination protection is not enabled for the instance). For example, you might want to terminate an instance when it has completed its work, and you don’t need the instance again. If you might want to use the instance later, you should stop the instance instead of terminating it.

22. You can create an Amazon CloudWatch alarm that monitors an Amazon EC2 instance and automatically reboots the instance. The reboot alarm action is recommended for Instance Health Check failures (as opposed to the recover alarm action, which is suited for System Health Check failures). An instance reboot is equivalent to an operating system reboot. In most cases, it takes only a few minutes to reboot your instance. When you reboot an instance, it remains on the same physical host, so your instance keeps its public DNS name, private IP address, and any data on its instance store volumes.

23. Rebooting an instance doesn't start a new instance billing hour, unlike stopping and restarting your instance. 

24. You can create an Amazon CloudWatch alarm that monitors an Amazon EC2 instance and automatically recovers the instance if it becomes impaired due to an underlying hardware failure or a problem that requires AWS involvement to repair. Terminated instances cannot be recovered. A recovered instance is identical to the original instance, including the instance ID, private IP addresses, Elastic IP addresses, and all instance metadata.

25. When the StatusCheckFailed_System alarm is triggered, and the recover action is initiated, you will be notified by the Amazon SNS topic that you chose when you created the alarm and associated the recover action. During instance recovery, the instance is migrated during an instance reboot, and any data that is in-memory is lost. When the process is complete, information is published to the SNS topic you've configured for the alarm. Anyone who is subscribed to this SNS topic will receive an email notification that includes the status of the recovery attempt and any further instructions. You will notice an instance reboot on the recovered instance. If your instance has a public IP address, it retains the public IP address after recovery.

26. To avoid a race condition between the reboot and recover actions, we recommend that you set the alarm threshold to 2 for 1 minute (that is, two consecutive one-minute evaluation periods) when creating alarms that recover an Amazon EC2 instance.
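A recover alarm along the lines of items 24 to 26 might be parameterized as shown below. This is a sketch under stated assumptions: it reads "2 for 1 minute" as two evaluation periods of 60 seconds each, and the Minimum statistic with a threshold of 0 is one plausible way to detect a failed check, not a confirmed recommendation. The function and alarm names are hypothetical, and no AWS call is made.

```python
from typing import Any, Dict


def recover_alarm(instance_id: str, region: str, sns_topic_arn: str) -> Dict[str, Any]:
    """Alarm parameters: recover the instance when the system status check fails."""
    return {
        "AlarmName": f"recover-{instance_id}",     # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Minimum",        # assumption: fail only if every sample failed
        "Period": 60,                  # one-minute periods
        "EvaluationPeriods": 2,        # two consecutive periods
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [
            f"arn:aws:automate:{region}:ec2:recover",  # recover action
            sns_topic_arn,                             # notification topic
        ],
    }


if __name__ == "__main__":
    # Placeholder instance ID, region, and topic ARN.
    print(recover_alarm("i-0123456789abcdef0", "us-east-1",
                        "arn:aws:sns:us-east-1:123456789012:recover-topic"))
```

Using a different number of evaluation periods for a reboot alarm on the same instance keeps the two actions from firing at the same moment.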

27. You can view alarm and action history in the Amazon CloudWatch console. Amazon CloudWatch keeps the last two weeks' worth of alarm and action history.

28. The MemoryUtilization metric is a custom metric and is not available by default. To use it, you must install the Amazon EC2 monitoring scripts (Perl) on your Linux instances.
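To show what a custom memory metric involves, the sketch below computes a utilization percentage from text in the format of Linux /proc/meminfo and builds the parameters for publishing it. It is not the Perl monitoring scripts and may not match their exact formula. Treating buffers and cache as free memory, the System/Linux namespace, and all function names are assumptions for this example, and nothing is sent to AWS.

```python
from typing import Any, Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse 'Name:   value kB' lines into a {name: value_in_kB} mapping."""
    values: Dict[str, int] = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[name.strip()] = int(fields[0])
    return values


def memory_utilization_percent(meminfo: Dict[str, int]) -> float:
    """Used memory as a percentage, counting buffers and cache as free (assumption)."""
    total = meminfo["MemTotal"]
    free = meminfo["MemFree"] + meminfo.get("Buffers", 0) + meminfo.get("Cached", 0)
    return 100.0 * (total - free) / total


def memory_metric_request(instance_id: str, percent: float) -> Dict[str, Any]:
    """Parameters shaped for a CloudWatch PutMetricData request."""
    return {
        "Namespace": "System/Linux",   # assumed custom namespace
        "MetricData": [{
            "MetricName": "MemoryUtilization",
            "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
            "Unit": "Percent",
            "Value": percent,
        }],
    }


if __name__ == "__main__":
    sample = "MemTotal: 1000 kB\nMemFree: 200 kB\nBuffers: 100 kB\nCached: 200 kB\n"
    percent = memory_utilization_percent(parse_meminfo(sample))
    print(memory_metric_request("i-0123456789abcdef0", percent))  # placeholder ID
```

Because the metric carries the InstanceId dimension, it is the kind of custom metric that item 17 says can be used with the stop, terminate, reboot, or recover alarm actions.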