Location: San Francisco, CA; Walnut Creek, CA; or New York, NY
Type: Contract to Hire
The Problem and Incident Analyst manages critical technology incidents to restore service quickly following an outage. This role also performs Root Cause Analysis on recent critical incidents to identify recommendations for preventing or better responding to such incidents.
- Restore service during critical incidents by acting with ownership and urgency, leading cross-functional teams to diagnose, troubleshoot and resolve service-impacting incidents.
- Facilitate meetings and bridge calls.
- Document incident events, resolution and follow-up action items, and ensure the accuracy of data in critical incident reports, including well-written executive summaries.
- Communicate incident status in a calm, clear, accurate and concise manner with colleagues, management and clients.
- Facilitate Root Cause Analysis (RCA) process with subject matter experts, colleagues, leaders and clients to identify causes of critical incidents by leading and participating in post-mortem efforts for outages. Document results of RCAs in a structured format, including both problems and recommendations for improvements.
- Assist in a team rotation for facilitation of the Daily Operations Meeting, designed to review the previous day’s events within bank operations.
- Regularly review the problem management process to ensure it is executed effectively.
- Work with other infrastructure teams on internal documentation and process enhancement, ensuring that all groups are working at an optimum level in major incident and problem management.
- Work with management and the ITIL consultant to build and continually enhance the service management process across the organization.
- Participate in an on-call rotation supporting the major incident process, which requires after-hours support on a 24/7 basis.
- Perform duties and responsibilities specific to department functions and activities.
- Perform other duties and responsibilities as required or assigned by supervisor.
- Formal Root Cause Analysis Training.
- Experience in a wide variety of IT disciplines, including networking, servers, storage, data management, operations and disaster recovery.
- Minimum of 5 years of IT problem management experience in an enterprise environment.
- ITIL v3 Foundation certification.
- Excellent analytical and problem-solving skills.
- Demonstrated ability to quickly understand complex systems.
- Ability to simultaneously work on multiple projects in a fast-paced environment.
- Strong verbal and written communication skills.
- Experience facilitating discussions with individuals at all levels within an organization.
- Ability to build effective relationships and provide extraordinary service.
- Must be able to review and analyze data reports and manuals; must be computer proficient.
- Must be able to communicate effectively via telephone and in person.