I was at a great Meetup last night where one of the speakers was Jeffrey Heer of the UW. His topic was data visualization. Apparently the Stanford Data Visualization Group moved to Seattle and is now the UW Interactive Data Lab. This talk reminded me of just how frustrating some of the problems I used to run into at work were because I couldn't find the right way to "see" the relationships. Now that I'm out here in the wild I find that there are entire swathes of tools being created which help engineers explore the relationships in the oceans of data I swam in. If I'd known about these guys then...
The mission of the Interactive Data Lab is to enhance people's ability to understand and communicate data through the design of new interactive systems for data visualization and analysis. We study the perceptual, cognitive and social factors affecting data analysis in order to improve the efficiency and scale at which expert analysts work, and to lower barriers for non-experts.
We work in near-petabyte territory, if not petabyte, and we couldn't gain access to the tools which would keep us relevant in the industry. Access to even Excel's PowerPivot would have been tremendously helpful. As I've mentioned before, R or Python would also have made it easier to work through some of the datasets which exceeded the limitations of the Excel we did have.
It's quite frustrating going to these meetups where they're looking for speakers and not one single engineer / "scientist" from any of the wireless carriers is ever there to give a speech. Where are the wireless engineers to talk about the creation of performance KPIs in complex networks? Data modeling? Data wrangling? The quality of the data? How to manage missing data, or alarm when data is not flowing? How the acquisition of data affects node performance? Managing monster sets of data? What visualization tools are the most useful with heterogeneous network statistics so relationships can be found? I've met recruiters from Century Link's Cloud Computing group, but that's not exactly a technical speech and they weren't engineers.
From what I've seen we've done a shitton more work with "big data" and predictive modeling and model analysis than most companies I see presenting here. People are calling data "big" when it's just the first time they've had access to the data stream, much less to the historical data (i.e., retention periods have been extended) so that they can work on predictions, yet the engineers and statisticians from a myriad of small companies are giving the speeches. Granted, these guys are real statisticians, PhDs, etc., but I'd like to see the theories applied to even more complex data sets. Then there's the Amazon, Google, Microsoft, Facebook speakers. They have some interesting talks, but again, so far I haven't heard speeches about performance from them in high traffic situations, and when they're questioned about the metrics they have available to guarantee service, the list is not long, or is summarized. They've got a minor fraction of what we were faced with in core engineering.
We can, and have, done better. And while we don't willingly (any longer, but yes, I have) participate in making configuration & modeling changes on the fly to see the impacts on our networks (live Ads would be an example), we have to during FOA deployments. Gah! The hours and years I spent trying to get that data acquired prior to First Office and getting questioned as to its necessity. IT'S TO BE ABLE TO DETECT FAILURES AT LAUNCH!!! (ooo... I just got all shouty-cap. Deep breath. Deep breath.)
Relevancy. That's what is being lost by being siloed away from the dialogues going on in the outside world here. Wireless engineers and scientists should be participating in these forums, especially if the companies are interested in software-defined networks on cloud networks using virtual machines. Mobility is a demanding technology and I am not hearing a thing about it, and I think that is quite odd. When mobility hits "the cloud" space, especially if we don't own the cloud, it's gonna rain dollars and not in a good way - we'll be sending money out - again - without having the data to prove whether or not we're getting what we paid for. Tracing network failures will be problematic and determining cause will be even more complicated.
My non-scientific search of world-wide Meetups with the word "telecom" returned about 90 related meetups, 2 in Seattle (which I've promptly just joined). However, there are 90 meetups in Seattle for just analytics alone, not including things like "Big Data" or "Python", "Visualization," etc. The corporations have cut travel budgets, they silo their engineers, they want to outsource as much as they can and there are no public forums for discussing technical details except things like vendor specific user groups. They're actively creating a class of engineers who don't know much more than how to do data entry, or other repetitive work. They're to make decisions based upon what they're told, not upon what the inquiry into the data returns.
In the interest of capital efficiency, it all comes down to this: if wireless engineering / "data scientist" / "network planning" (whatever the hell you want to call yourself) types do not have access to, and do not participate in, this larger world of "big data" discussions, the corporations will not get what they paid for, the networks will not be optimized, and worse - our (oh, I can't use that word any longer) customers will suffer. Wireless engineers have fiscal responsibility for hundreds (yeah, I know it's more) of millions of dollars of equipment. If we're not participating in this open dialogue at this time, if we're not doing more with more data, then we're falling behind, becoming the irrelevant ones, and sure - anyone could do our job.
By the way, here's some more interesting information I came across from last night's talk:
imMens: Real-time Visual Querying of Big Data
And someone sent me this article with a question of "new metrics?" in regards to cloud computing and virtual machines. Uhm. Yep. Just think about the complexity of trying to build a metric around The I/O Blender Effect. I'd be all over "5 minute data is too high a granularity in measurement interval... I want 1!" and I'd lose.