Shining light on transferrable skills on your data science journey

Beam shape image (captured by the writer)

Preface

I spent 5 years as a researcher working on laser physics, nonlinear optics, and solid-state laser engineering. I was fully immersed in the field and enthusiastic about what I was doing, but eventually I made the transition into industrial data science.

After another 6 years of working in data science, I have the impression that the skill set I developed in applied physics is of great use on industrial projects that are not related to laser physics at all.

Plenty has been written about how useful academic experience can be, but I decided to express my personal opinion on the topic.

To make my point, I’ve decided to rate each group of skills by how useful it is and explain why.

Who is this article for?

I believe I wrote it mostly for people interested in the transition from the academic environment into industry, but also for myself, to reflect on the intersection of tools, skills, and mindsets between the two fields.

Experience with literature review → 7/10

Why is literature review such a terrific and transferrable skill (habit) for industrial data science?

Literature review back in my physics days (writer’s desk)

In my opinion, literature review is somewhat overlooked and misunderstood in industrial data science. I’m not saying that we don’t read enough about brand-new model architectures and framework designs (that part is done exceptionally well).

But when it comes to quickly and effectively gathering structured, useful information about the project at hand, that is where I see the largest gap in the data science world.

Literature review may not even be the best term here. I could also call it background research or state-of-the-art evaluation.

When dealing with a business problem, I think it is crucial to get at least some theoretical grounding in it. What a literature review does:

  • Forms a foundation for solid decisions on data strategy. It acquaints you with existing techniques and approaches in the domain.
  • Speeds up onboarding. If you are new to the domain you are working in, gaining knowledge of the topic as quickly as possible is the first step toward generating value.
  • Improves the quality of communication with experts in the field. Domain experts, also called subject matter experts, are invaluable for solving data problems. But they typically don’t program, and they are pretty busy. So data scientists must acquire some understanding of the domain-specific terminology and concepts to communicate effectively and collaborate seamlessly with these experts.
  • Drastically improves the quality of your insights. A literature review strengthens the foundation for decisions about data collection, preprocessing, modeling, and evaluation, which ultimately improves the quality of the insights you deliver. In my experience, it works, but not all the time.

Paying attention to literature review, and investing time and effort in it, embodies a particular mindset: open-minded, humble, and inquisitive. It helps you avoid reinventing the wheel and falling into the trap of confirmation bias.

I believe that the process of literature review will change with the rise of large language models and the services built on them, but we are not there yet.

Journaling → 9/10

Transferring journaling practices from academia to industrial data science has been very rewarding for me. Beyond its many practical advantages, it gives you a priceless sense of continuity through the ups and downs of a researcher’s working life. In my opinion, by adopting the keystone habit of maintaining a lab notebook, data scientists can easily track their experiments, jot down ideas and observations, and monitor their personal and professional growth. I wrote a whole separate piece on why it is such a great idea to do so; feel free to check it out!

Knowledge of programming → 6/10

In my scientific journey, I worked on experimental data processing, numerical simulations, and statistical learning on an everyday basis. Programming was also essential for developing and testing new laser designs before testing physical prototypes (numerical simulations).

I’ve used it consistently for typical data science stuff:

  • experimental data processing (Python, Wolfram)
  • numerical simulations (Wolfram, Matlab, Python)
  • statistical learning (Wolfram, Matlab, Python)
  • data visualization (Origin Pro, Python, R)

My “working with data” scientific stack

Wolfram (Wolfram Mathematica, more specifically) was the most heavily used tool because we had a license for it in the lab. It had a great toolset for solving nonlinear differential equations, and we used it widely for numerical simulations.

Python was my tool of choice for wrangling data generated during experiments (beam shapes, oscillograms).

When it comes to data visualization, Origin was the primary tool because it allowed embedding visuals into text documents while keeping them editable. Line charts, histograms (including kernel density estimates), regression analysis: Origin handled them all well. Origin has a GUI, so it is not really about coding; I just have to mention it so that Python and R don’t get all the data viz credit.

In general, I had solid experience with each of the tools mentioned above: I know the syntax and I can solve problems with decent efficiency. So why just 6/10? Why do programming skills gained in academia transfer relatively poorly to industrial data science? That is a pretty strong statement, but I think the downsides of academic experience may outweigh the upsides, mainly because good software practices are completely neglected in many scientific environments.

Caveat: this statement is based on my personal experience of working in applied physics, and it definitely doesn’t apply to everyone working in academia. Take everything in this section with a grain of salt!

On the one hand, neglecting good software principles is a natural consequence of researchers optimizing for speed of research and number of publications, not for code quality and maintainability. On the other hand, almost no one comes to academia from proper software development (for financial reasons), so there is no real production expertise in the first place. I should also mention that designing experiments, doing a literature review, collecting measurements, writing code to process them, and extracting useful insights, all at the same time, is exhausting. As a consequence, you simply don’t have enough resources left to study software development.

Proficiency in conducting measurements → 9/10

This one is hard to explain, so bear with me. Measuring stuff in applied laser physics is a discipline of its own. Delivering useful measurements is a skill that takes years to train! There are many reasons for that: you have to know the physics of the process, follow the measurement protocol, and have the specialized knowledge and training to operate complex and expensive instrumentation.

For instance, I worked with diode-pumped pulsed solid-state lasers, measuring multiple parameters of the laser beam: pulse duration, pulse energy, repetition rate, beam profile, divergence, polarization, spectral content, temporal profile, and beam waist. Doing any of these measurements is so damn difficult. Say, for example, you want to measure the beam profile (see the image below).

Beam profiles, 3D (captured by the writer)

Beam profile refers to the spatial distribution of the laser beam’s intensity across its cross-section, or transverse plane.
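
To make that definition concrete, here is a minimal sketch in Python that treats a beam profile as a 2D array of intensity values, one per camera pixel. The Gaussian shape, the grid size, and the function names `gaussian_profile` and `centroid_and_widths` are assumptions made for illustration, not the processing I actually used in the lab.

```python
import numpy as np


def gaussian_profile(size=101, center=(50.0, 50.0), w=12.0):
    """Synthetic intensity map I(x, y) = exp(-2 r^2 / w^2) on a size x size pixel grid.

    All parameters are illustrative; a real profile comes from the camera.
    """
    y, x = np.mgrid[0:size, 0:size]
    cx, cy = center
    r2 = (x - cx) ** 2 + (y - cy) ** 2
    return np.exp(-2.0 * r2 / w**2)


def centroid_and_widths(image):
    """Return the intensity-weighted centroid (x, y) and the 4-sigma widths, in pixels."""
    total = image.sum()
    y, x = np.indices(image.shape)
    cx = (x * image).sum() / total
    cy = (y * image).sum() / total
    # Second moments of the intensity distribution around the centroid.
    sigma_x = np.sqrt((((x - cx) ** 2) * image).sum() / total)
    sigma_y = np.sqrt((((y - cy) ** 2) * image).sum() / total)
    return (cx, cy), (4.0 * sigma_x, 4.0 * sigma_y)


profile = gaussian_profile()
(cx, cy), (width_x, width_y) = centroid_and_widths(profile)
print(f"centroid: ({cx:.2f}, {cy:.2f}) px, widths: {width_x:.2f} x {width_y:.2f} px")
```

Real camera frames also carry background offset and noise, which second-moment widths are sensitive to, so treat this as a picture of the data structure rather than a measurement recipe.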

In theory, you just direct a laser beam onto a CCD camera and get your beam shape in seconds. But doing it in practice is a whole different story. If you are working with a pulsed solid-state laser with a decent pulse energy, and you know what you are doing, you’ll direct the laser beam onto a high-quality optical wedge to send most of the pulse energy into a trap and work with a reflection of the beam that carries only a fraction of the energy of the original beam. You’ll do so to protect the CCD camera from a disaster. But using a wedge will not be enough. You’ll install an adjustable beam attenuator, lock it into the darkest mode, and then gradually lower the absorption rate until you get the right exposure on your CCD camera.

If you are working with an infrared laser that is invisible to the human eye, you face another problem: you have to steer the beam through small apertures without seeing the actual beam. This skill alone can only be acquired through training and practice. By the way, each step of beam manipulation must be done with extreme care because of safety regulations: you have to wear appropriate protective goggles, use protective screens, etc.

Okay, moving on. Now your beam is attenuated and sits nicely on the CCD camera. But you still have plenty to do: wire the CCD camera to the laser power unit to achieve synchronization and produce a stable image. If you’ve done everything correctly, you get your images. Wait, images?

Beam profiles, 2D (captured by the writer)

You then realize that if your laser operates at a pulse repetition rate of 50 Hz, it produces 50 pulses a second. Each pulse may have a slightly different beam profile. How do you produce the result? Should you just pick a random shot and capture the image? Or should you produce an average image from a certain number of pulses? Oh, the averaging was enabled by default by the software managing the CCD camera?
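
The difference between a single shot and an averaged image can be sketched with synthetic data. Everything below is an assumption made for illustration: the Gaussian beam shape, the amount of pointing jitter and camera noise, and the name `simulate_shots` are hypothetical and not taken from any camera software.

```python
import numpy as np


def simulate_shots(rng, n_shots=50, size=64, w=10.0, jitter=1.5, noise=0.05):
    """Simulate n_shots camera frames of a Gaussian beam.

    Each frame gets a slightly shifted beam center (pointing jitter)
    and additive Gaussian camera noise. All values are illustrative.
    """
    y, x = np.mgrid[0:size, 0:size]
    frames = np.empty((n_shots, size, size))
    for i in range(n_shots):
        cx, cy = size / 2 + rng.normal(0.0, jitter, 2)
        beam = np.exp(-2.0 * ((x - cx) ** 2 + (y - cy) ** 2) / w**2)
        frames[i] = beam + rng.normal(0.0, noise, (size, size))
    return frames


rng = np.random.default_rng(seed=0)
frames = simulate_shots(rng)             # one second of pulses at 50 Hz

single_shot = frames[0]                  # "pick one shot"
averaged = frames.mean(axis=0)           # "average over all shots"
shot_to_shot_std = frames.std(axis=0)    # per-pixel variation between shots

print("background noise, single shot:", single_shot[:5, :5].std())
print("background noise, averaged:   ", averaged[:5, :5].std())
```

In this toy model the averaged image is cleaner, but it also smears the beam by the pointing jitter, so the two images answer different questions. That is exactly why it matters to know whether the acquisition software averaged for you.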

Let’s wrap this “measuring beam shape” nonsense up. From all the measurements I did in my life, I took away two key transferrable qualities: vigilance (NEVER take anything at face value) and meticulous attention to metadata (how exactly the data was measured or recorded, which tools were used, and even why it was collected in the first place). Both are golden when it comes to working with real-life data, because they let you be far more efficient in producing actual impact without going down rabbit holes. And that is something valued both in academia and in industrial data science.
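
One way to turn attention to metadata into a habit is to store a small record next to every dataset. The sketch below is only an illustration: the `MeasurementMetadata` class, its field names, and all example values except the 50 Hz repetition rate are hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class MeasurementMetadata:
    """How, with what, and why a dataset was recorded (hypothetical schema)."""
    instrument: str
    repetition_rate_hz: float
    averaging_enabled: bool
    frames_averaged: int
    attenuation_notes: str
    purpose: str


meta = MeasurementMetadata(
    instrument="CCD camera (placeholder name)",
    repetition_rate_hz=50.0,
    averaging_enabled=True,   # the default you did not know about
    frames_averaged=16,       # illustrative value
    attenuation_notes="optical wedge plus adjustable attenuator",
    purpose="beam profile check after realignment",
)

# Serialize to JSON so the record can be saved alongside the image files.
sidecar = json.dumps(asdict(meta), indent=2)
print(sidecar)
```

The same idea carries over to industrial data: a short note on how a table was extracted, with which filters and for what purpose, answers most of the questions that otherwise surface weeks later.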

Data Communication Proficiency → 10/10

While I was in academia, I didn’t consider data communication to be a particularly noteworthy or useful topic to write about. Working on data visualizations, talking about data and theories, and writing scientific papers were just part of the job. But after years of doing research, you gain a solid skill set in data communication on different levels (both formal and informal).

Writing a scientific paper is one of the harder skills to acquire among the formal types of data communication. It takes a lot of practice to be able to compose a compelling piece with a proper structure (abstract → intro → literature review → methodology → results → discussion → conclusion → acknowledgments). The structure of the article itself presumes that you have a story to tell. And it is not only about writing: you have to know how to produce compelling and purposeful visual representations of data. All of this serves to get your message across to the audience.

I rate the transferability of this skill 10 out of 10 because industrial data science, unsurprisingly, relies on interactions between humans: communicating your thoughts and results.

Conclusion

Overall, I believe that those with a scientific background can bring unique perspectives and useful skills to the field of data science. To those in academia who believe that transitioning to a career in industrial data science means abandoning all their hard work and expertise, I offer a different perspective: you have a wealth of value to bring to the table. In my opinion, the best course of action is to leverage your existing skills while picking up the new techniques and best practices of the field you transition into (we all know it is a lifelong journey).
