Systems Engineer II

Company: Laine Recruiting
Location: Rochester, NY, USA
Published: 6/14/2022
Full Time

Laine Recruiting has been engaged by one of the most respected higher education institutions and largest employers in the Rochester area! We've partnered to fill several roles within their Center for Integrated Research Computing (CIRC). This team provides hardware, software, training and support to over 110 departments across the organization.

About the Role

The Systems Engineer II shares responsibility for management and administration of advanced on-premises and cloud-based computing, networking, and storage for research. In addition to administering servers and virtual systems, the position requires specialized skills for managing advanced computing architectures (e.g. high-performance computing systems and accelerators such as GPUs), specialized high-speed network technology (e.g. low-latency InfiniBand networks and research cluster topologies), and massive parallel file systems engineered for high-volume and high-velocity research data. The position assists in the deployment and management of specialized tools for configuring and controlling research computing environments (e.g. SLURM, service nodes, etc.). It also shares responsibility for the design, setup, and maintenance of a University-wide research computing infrastructure with monitoring and security, and assists with communicating about advanced research technology solutions with faculty from several departments, centers, and schools.

Overview of Responsibilities

  • Analyzes research use case requirements, designs solutions, and deploys on-premises or cloud-based advanced infrastructure for University-wide research computing, spanning existing and emerging application areas of high-performance computing, including artificial intelligence, big data, modeling, and simulation.
  • Designs, develops, deploys, and maintains systems that require creative assembly of specialized computing (e.g. GPUs and accelerators), advanced network (e.g. InfiniBand) and parallel file system (e.g. GPFS) infrastructure. Leverages specialized infrastructure to deploy research computing solutions that are typically unconventional or unorthodox in traditional information technology environments.
  • Performs proactive performance monitoring of high-performance computing and supporting resources, including the analysis, alerting, reporting, and tuning of computational accelerators, high-bandwidth and low-latency networks, parallel file systems, and scheduling/resource management software. Provides capacity analysis, maintenance, and troubleshooting activities for advanced infrastructure in the research computing environment. Thinks creatively to respond to performance issues, system errors, and maintenance that are outside of standard information technology process controls and procedures.
  • Creates, reviews, and maintains technical documentation including solution designs and reference guides for institution-wide research computing infrastructure. Prepares necessary paperwork and documentation to ensure compliance with federal funding agency standards and procedures. Participates in discussions of new products and services to enhance the delivery of research computing infrastructure; engages vendors as appropriate.
  • Maintains a broad knowledge of advanced technology, specialized equipment, and security and research compliance requirements. Remains mindful and vigilant of risks to the research enterprise while consulting with faculty, performing work, and planning activities.


Qualifications

  • 5+ years of technical experience
  • Knowledge of high-performance computing hardware and software
  • Experience with research networking and research storage solutions
  • Expertise with virtualization technology and cloud providers such as Amazon Web Services and Microsoft Azure
  • Excellent verbal and written communication skills and exemplary speaking and presentation skills, as well as the ability to interact with faculty and staff, as appropriate, to communicate and to process communications from others on technical changes and research computing solution recommendations
  • Ability to provide on-call support as required, as well as ability to perform after-hours and weekend maintenance and implementation activities
  • Ability to travel to and from Data Center facilities
  • Strong scripting skills
  • Expertise with server operating systems and orchestration and automation tools